Introduction to Neural Networks: Bank Churn prediction¶
Problem Statement¶
Context¶
Service businesses like banks have to worry about the problem of 'Customer Churn', i.e. customers leaving and joining another service provider. It is important to understand which aspects of the service influence a customer's decision in this regard, so that management can concentrate service-improvement efforts on these priorities.
Objective¶
As a data scientist with the bank, you need to build a neural-network-based classifier that can determine whether a customer will leave the bank in the next 6 months.
Data Dictionary:¶
CustomerId: Unique ID which is assigned to each customer
Surname: Last name of the customer
CreditScore: It defines the credit history of the customer.
Geography: A customer’s location
Gender: It defines the Gender of the customer
Age: Age of the customer
Tenure: Number of years for which the customer has been with the bank
NumOfProducts: Number of products that the customer has purchased through the bank
Balance: Account balance
HasCrCard: A categorical variable indicating whether the customer has a credit card or not
EstimatedSalary: Estimated salary
IsActiveMember: A categorical variable indicating whether the customer is an active member of the bank or not (active in the sense of regularly using bank products, making transactions, etc.)
Exited: Whether or not the customer left the bank within six months. It can take two values: 0 = No (customer did not leave the bank), 1 = Yes (customer left the bank)
Steps or Approach to Study¶
Step 1: Importing the Libraries¶
To begin with, we need to import the necessary libraries and dependencies that we will be using throughout the project. These include pandas, NumPy, matplotlib, seaborn, scikit-learn, and TensorFlow
Step 2: Data Analysis & Visualization¶
Next, we need to analyze and visualize the dataset to gain insights into the dataset and understand the variables that may contribute to customer churn.
Step 3: Apply One Hot Encoding Technique on Gender & Geography Column¶
Since our dataset contains categorical variables like “Gender” and “Geography,” we need to convert them into numerical format for the ANN model to process. We apply the one-hot encoding technique to transform these categorical columns into binary columns, representing different categories
Step 4: Handle the Imbalanced Target Column and Balance It¶
Customer churn datasets often suffer from class imbalance, where the number of churners is significantly smaller than non-churners. To address this issue, we will balance the target column by upsampling the minority class. This ensures that the model does not get biased towards the majority class during training
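The upsampling idea can be sketched with `sklearn.utils.resample`; the tiny dataframe below is an illustrative stand-in for the churn data, not the real dataset:

```python
import pandas as pd
from sklearn.utils import resample

# toy stand-in for the churn dataframe: 8 non-churners, 2 churners
df = pd.DataFrame({"Balance": range(10),
                   "Exited": [0] * 8 + [1] * 2})

majority = df[df["Exited"] == 0]
minority = df[df["Exited"] == 1]

# resample the minority class with replacement up to the majority size
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=1)

balanced = pd.concat([majority, minority_up])
print(balanced["Exited"].value_counts())  # both classes now have 8 rows
```

The same pattern applies to the full dataset by filtering on the real `Exited` column before the train/test split.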
Step 5: Splitting the Dataset into Training and Testing Sets¶
Before training our ANN model, we need to preprocess the data by separating the features from the target variable. We then split the dataset into training and testing sets. The training set is used to train the model on historical data, while the testing set is used to evaluate the model’s performance on unseen data. Typically, around 70–80% of the data is used for training, and the remaining 20–30% for testing.
Step 6: Feature Scaling - Sklearn Standard Scaler Technique¶
Feature scaling is an essential preprocessing step for ANN models. It standardizes the range of the input features, ensuring that they have similar scales, which prevents certain features from dominating others during training. We apply feature scaling using scikit-learn's StandardScaler to standardize the input feature values.
Step 7: Build the Artificial Neural Network (ANN) Model¶
Next, we construct the architecture of the ANN model using the Keras library, a high-level API for building neural networks in TensorFlow. The model consists of multiple layers of interconnected neurons: an input layer, hidden layers, and an output layer. We define the number of neurons in each layer, the activation functions, and other parameters. The choice of architecture depends on the complexity of the problem and the available computational resources.
Step 8: Training the ANN Model¶
Once the model architecture is defined, we train the ANN model using the training dataset. During training, the model adjusts its internal parameters (weights and biases) to minimize the difference between the predicted churn outcomes and the actual outcomes. We specify the number of epochs (iterations) and the batch size for training. Monitoring metrics such as accuracy, loss, and validation metrics help assess the model’s performance during training.
Step 9: Visualize the Loss & Accuracy of ANN Model¶
Finally, we can visualize the model’s performance by plotting the training and validation accuracy and loss over the epochs
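A minimal plotting sketch for this step, assuming the `History` object returned by `model.fit()`; a small hand-made dict stands in for `history.history` here:

```python
import matplotlib.pyplot as plt

# illustrative stand-in for history.history from model.fit()
history = {"loss": [0.88, 0.49, 0.45], "val_loss": [0.51, 0.48, 0.45],
           "accuracy": [0.33, 0.80, 0.80], "val_accuracy": [0.80, 0.80, 0.80]}

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(history["loss"], label="train loss")
ax1.plot(history["val_loss"], label="val loss")
ax1.set_xlabel("epoch"); ax1.legend()
ax2.plot(history["accuracy"], label="train accuracy")
ax2.plot(history["val_accuracy"], label="val accuracy")
ax2.set_xlabel("epoch"); ax2.legend()
plt.show()
```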
Step 10: Evaluating the ANN Model & Predicting Churn for New Customers¶
After training, we evaluate the performance of the ANN model using the testing dataset. We calculate various evaluation metrics such as accuracy, precision, recall, and F1 score to assess how well the model predicts customer churn.
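These metrics can be computed with scikit-learn once the predicted probabilities are thresholded into class labels; the arrays below are illustrative stand-ins for the real test set and model output:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# illustrative stand-ins: true labels and predicted churn probabilities
y_test = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.2, 0.6, 0.8, 0.4, 0.1, 0.9])
y_pred = (y_prob > 0.5).astype(int)  # threshold probabilities at 0.5

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f1       :", f1_score(y_test, y_pred))
```

On the real model, `y_prob` would come from `model_ann.predict(X_test_ann)`.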
Step 11: Predict Churn and Generate a Prediction Score¶
Utilize the trained ANN model to predict churn for new customers by inputting their characteristics and behaviors into the model. The model generates a churn prediction score or probability, indicating the likelihood of the customer churning. Businesses can then take appropriate actions, such as targeted retention campaigns or personalized offers, to mitigate churn risk for these customers.
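A hedged sketch of this prediction step: in the notebook it would use `model_ann.predict` on a row scaled with the fitted `StandardScaler`; here a tiny sigmoid scorer stands in for the trained network:

```python
import numpy as np

# stand-in "model": a sigmoid over the feature sum, mimicking the shape of
# model_ann.predict, which returns one churn probability per row
predict = lambda X: 1.0 / (1.0 + np.exp(-X.sum(axis=1, keepdims=True)))

# one new, already-scaled customer with 11 features (hypothetical values)
new_customer = np.full((1, 11), 0.1)
prob = predict(new_customer)[0, 0]   # churn probability in [0, 1]
label = int(prob > 0.5)              # 1 = likely to churn
print(prob, label)
```

The probability itself is the "prediction score"; the business can rank customers by it to target retention campaigns.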
Step 12: Find the Important Features of the Model¶
Use SHAP to identify the most important features in the model, so that appropriate, professional business insights can be offered.
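If SHAP is unavailable, the same question can be approached with model-agnostic permutation importance (a simpler relative of SHAP values): shuffle one feature at a time and measure how much accuracy drops. Everything below, including the stand-in "model", is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)                  # only feature 0 matters here
predict = lambda X: (X[:, 0] > 0).astype(int)  # stand-in "model"

base_acc = (predict(X) == y).mean()
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])       # break this feature's link to y
    drop = base_acc - (predict(Xp) == y).mean()
    print(f"feature {j}: importance {drop:.3f}")
```

A large accuracy drop marks an important feature; features 1 and 2 score zero here because the stand-in model ignores them.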
Step 1 - Importing necessary libraries¶
# To help with reading and manipulating data
import pandas as pd
import numpy as np
# To help with data visualization
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set_style('darkgrid')
# To get different metric scores, and split data
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
# TensorFlow and Keras for building the neural network
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout,LeakyReLU, ReLU
# To suppress warnings
import warnings
warnings.filterwarnings("ignore")
#checking tensorflow version
print(tf.__version__)
2.19.0
Loading the dataset¶
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
# Loading the data into my colab notebook
data = pd.read_csv('/content/drive/My Drive/UTA - AIML/Deep Learning project/Churn.csv')
Step 2: Data Analysis & Visualization¶
Data Overview¶
# checking the first few rows of the data
data.head()
| | RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 |
Observation:
- The dataframe contains 14 columns
- Each column contains information about a customer, such as geographical location, gender, age, and estimated salary
# checking the structure of the data
data.shape
(10000, 14)
Observation:
- The dataframe has 14 columns and 10,000 rows
data.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 10000 entries, 0 to 9999 Data columns (total 14 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 RowNumber 10000 non-null int64 1 CustomerId 10000 non-null int64 2 Surname 10000 non-null object 3 CreditScore 10000 non-null int64 4 Geography 10000 non-null object 5 Gender 10000 non-null object 6 Age 10000 non-null int64 7 Tenure 10000 non-null int64 8 Balance 10000 non-null float64 9 NumOfProducts 10000 non-null int64 10 HasCrCard 10000 non-null int64 11 IsActiveMember 10000 non-null int64 12 EstimatedSalary 10000 non-null float64 13 Exited 10000 non-null int64 dtypes: float64(2), int64(9), object(3) memory usage: 1.1+ MB
Observation:
- The dataset has 14 columns and 10,000 observations
- There are no columns that have missing values
data.describe().T
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| RowNumber | 10000.0 | 5.000500e+03 | 2886.895680 | 1.00 | 2500.75 | 5.000500e+03 | 7.500250e+03 | 10000.00 |
| CustomerId | 10000.0 | 1.569094e+07 | 71936.186123 | 15565701.00 | 15628528.25 | 1.569074e+07 | 1.575323e+07 | 15815690.00 |
| CreditScore | 10000.0 | 6.505288e+02 | 96.653299 | 350.00 | 584.00 | 6.520000e+02 | 7.180000e+02 | 850.00 |
| Age | 10000.0 | 3.892180e+01 | 10.487806 | 18.00 | 32.00 | 3.700000e+01 | 4.400000e+01 | 92.00 |
| Tenure | 10000.0 | 5.012800e+00 | 2.892174 | 0.00 | 3.00 | 5.000000e+00 | 7.000000e+00 | 10.00 |
| Balance | 10000.0 | 7.648589e+04 | 62397.405202 | 0.00 | 0.00 | 9.719854e+04 | 1.276442e+05 | 250898.09 |
| NumOfProducts | 10000.0 | 1.530200e+00 | 0.581654 | 1.00 | 1.00 | 1.000000e+00 | 2.000000e+00 | 4.00 |
| HasCrCard | 10000.0 | 7.055000e-01 | 0.455840 | 0.00 | 0.00 | 1.000000e+00 | 1.000000e+00 | 1.00 |
| IsActiveMember | 10000.0 | 5.151000e-01 | 0.499797 | 0.00 | 0.00 | 1.000000e+00 | 1.000000e+00 | 1.00 |
| EstimatedSalary | 10000.0 | 1.000902e+05 | 57510.492818 | 11.58 | 51002.11 | 1.001939e+05 | 1.493882e+05 | 199992.48 |
| Exited | 10000.0 | 2.037000e-01 | 0.402769 | 0.00 | 0.00 | 0.000000e+00 | 0.000000e+00 | 1.00 |
Observations:
- Age has a large range of values, i.e. 18 to 92.
- Credit Score ranges between 350 and 850.
- The highest estimated salary is USD 199,992.48.
- More in-depth analysis is required.
#checking if there is any duplication in the dataset
data.duplicated().sum()
np.int64(0)
Observation:
- There are no duplicated values in the dataset
data.isnull().sum()
| | 0 |
|---|---|
| RowNumber | 0 |
| CustomerId | 0 |
| Surname | 0 |
| CreditScore | 0 |
| Geography | 0 |
| Gender | 0 |
| Age | 0 |
| Tenure | 0 |
| Balance | 0 |
| NumOfProducts | 0 |
| HasCrCard | 0 |
| IsActiveMember | 0 |
| EstimatedSalary | 0 |
| Exited | 0 |
Observation:
- There are no null values in the dataset
# Showing the unique values of each column
data.nunique()
| | 0 |
|---|---|
| RowNumber | 10000 |
| CustomerId | 10000 |
| Surname | 2932 |
| CreditScore | 460 |
| Geography | 3 |
| Gender | 2 |
| Age | 70 |
| Tenure | 11 |
| Balance | 6382 |
| NumOfProducts | 4 |
| HasCrCard | 2 |
| IsActiveMember | 2 |
| EstimatedSalary | 9999 |
| Exited | 2 |
Observation:
- Customer Age has 70 unique values, Balance 6,382, CreditScore 460, Tenure 11, etc.
- Geography is the only categorical column with more than two categories that we will consider
- There are 4 columns that contain binary category values (Gender, HasCrCard, IsActiveMember, Exited)
# drop the irrelevant columns
data.drop(columns=["RowNumber", "CustomerId", "Surname"], inplace = True)
#checking the data after dropping irrelevant columns
data.head()
| | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | France | Female | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 502 | France | Female | 42 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 699 | France | Female | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 |
Observation:
- 3 columns have been dropped: RowNumber, CustomerId, Surname
Exploratory Data Analysis¶
- EDA is an important part of any project involving data.
- It is important to investigate and understand the data better before building a model with it.
- A few questions have been mentioned below which will help you approach the analysis in the right manner and generate insights from the data.
- A thorough analysis of the data, in addition to the questions mentioned below, should be done.
Questions:
- What is the distribution of the credit score of customers? Are there any noticeable patterns or outliers in the distribution?
- How many active members are there with the bank?
- How are the different customer attributes correlated to each other?
- Who churns more: male or female customers?
- Customers from which geographical part are churning more?
# function to plot stacked bar chart
def stacked_barplot(data, predictor, target):
"""
Print the category counts and plot a stacked bar chart
data: dataframe
predictor: independent variable
target: target variable
"""
count = data[predictor].nunique()
sorter = data[target].value_counts().index[-1]
tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
by=sorter, ascending=False
)
print(tab1)
print("-" * 80)
tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
by=sorter, ascending=False
)
tab.plot(kind="bar", stacked=True, figsize=(count + 1, 3))
plt.legend(loc="upper left", bbox_to_anchor=(1, 1), frameon=False)
plt.show()
### Function to plot distributions
def distribution_plot_wrt_target(data, predictor, target):
fig, axs = plt.subplots(2, 2, figsize=(12, 7))
target_uniq = data[target].unique()
axs[0, 0].set_title("Distribution of target for target variable=" + str(target_uniq[0]))
sns.histplot(
data=data[data[target] == target_uniq[0]],
x=predictor,
kde=True,
ax=axs[0, 0],
color="teal",
)
axs[0, 1].set_title("Distribution of target for target variable=" + str(target_uniq[1]))
sns.histplot(
data=data[data[target] == target_uniq[1]],
x=predictor,
kde=True,
ax=axs[0, 1],
color="orange",
)
axs[1, 0].set_title("Boxplot w.r.t target")
sns.boxplot(data=data, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow")
axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
sns.boxplot(
data=data,
x=target,
y=predictor,
ax=axs[1, 1],
showfliers=False,
palette="gist_rainbow",
)
plt.tight_layout()
plt.show()
Geography¶
# check the value counts of "Geography" column.
data_geography = data['Geography'].value_counts()
data_geography
| | count |
|---|---|
| Geography | |
| France | 5014 |
| Germany | 2509 |
| Spain | 2477 |
# let's perform univariate EDA on the "Geography" column.
plt.figure(figsize=(10,6))
res=sns.barplot(x=data_geography, y=data_geography.index)
res.set_yticklabels(res.get_ymajorticklabels(), fontsize = 16, color='black')
plt.title('Value Counts of Customers By Geographical Location',fontsize = 16, color='black')
plt.ylabel('Geographic Locations',fontsize = 16, color='black')
plt.xlabel('Counts', fontsize =16, color= 'black')
plt.show()
Observation:
- About half of the customers (5,014) are from France
- The remaining customers are from Germany (2,509) and Spain (2,477)
Gender¶
# check the value counts of "Gender" column.
data_gender = data['Gender'].value_counts()
data_gender
| | count |
|---|---|
| Gender | |
| Male | 5457 |
| Female | 4543 |
# let's perform univariate EDA on the "Gender" column.
plt.figure(figsize=(8,6))
res=sns.barplot(x=data_gender, y=data_gender.index)
res.set_yticklabels(res.get_ymajorticklabels(), fontsize = 16, color='black')
plt.title('Value Counts of Customers By Gender',fontsize = 16, color='black')
plt.ylabel('Gender',fontsize = 16, color='black')
plt.xlabel('Counts', fontsize =16, color= 'black')
plt.show()
Observation:
- The gender distribution is fairly balanced, with slightly more male customers than female
- There are 5,457 male and 4,543 female customers
- It is interesting that male customers slightly outnumber female customers
Number of Products¶
# check the value counts of "number of products" column.
data_nop = data['NumOfProducts'].value_counts()
data_nop
| | count |
|---|---|
| NumOfProducts | |
| 1 | 5084 |
| 2 | 4590 |
| 3 | 266 |
| 4 | 60 |
# let's perform univariate EDA on the "NumOfProducts" column.
plt.figure(figsize=(10,6))
plt.title('Value Counts of No of Products Used By The Customers',fontsize = 16, color='black')
plt.ylabel('Counts',fontsize = 16, color='black')
plt.xlabel('No of Products', fontsize =16, color= 'black')
res=sns.barplot(y=data_nop, x=data_nop.index)
res.set_yticklabels(res.get_ymajorticklabels(), fontsize = 16, color='black')
plt.show();
Observation:
- Customers hold between 1 and 4 products
- 5,084 customers hold 1 product
- 4,590 customers hold 2 products
- 326 customers hold 3 or 4 products
Has Credit Card¶
# check the value counts of "HasCrCard" column.
data_crc = data['HasCrCard'].value_counts()
data_crc
| | count |
|---|---|
| HasCrCard | |
| 1 | 7055 |
| 0 | 2945 |
# let's perform univariate EDA on the "HasCrCard" column.
plt.figure(figsize=(8,6))
res=sns.barplot(y=data_crc, x=data_crc.index)
res.set_yticklabels(res.get_ymajorticklabels(), fontsize = 16, color='black')
plt.ylabel('Counts of if Customers Have CC or Not',fontsize = 16, color='black')
plt.xlabel('Customer - Have CC or Not',fontsize = 16, color='black')
plt.title('Customers Who have Credit Cards or Not',fontsize = 16, color='black')
plt.show()
Observation:
- Most customers have credit cards: 7,055 (70.55%)
- The remaining 2,945 customers (29.45%) do not have credit cards
Active or Not Active Customers¶
# check the value counts of "IsActiveMember" column.
data_iam = data['IsActiveMember'].value_counts()
data_iam
| | count |
|---|---|
| IsActiveMember | |
| 1 | 5151 |
| 0 | 4849 |
# let's perform univariate EDA on the "IsActiveMember" column.
plt.figure(figsize=(8,4))
res=sns.barplot(y=data_iam, x=data_iam.index)
res.set_yticklabels(res.get_ymajorticklabels(), fontsize = 16, color='black')
plt.ylabel('Value Counts of Customer Status',fontsize = 16, color='black')
plt.xlabel('Active or Not Active',fontsize = 16, color='black')
plt.title('Status of Customers - Active or Not Active',fontsize = 16, color='black')
plt.show()
Observation:
- 52% of the customers maintained active accounts with the bank while 48% had inactive accounts
Churned/Not Churned¶
# check the value counts of "Exited" target column.
data_exit = data['Exited'].value_counts()
data_exit
| | count |
|---|---|
| Exited | |
| 0 | 7963 |
| 1 | 2037 |
# let's perform univariate EDA on the "Exited" target column.
plt.figure(figsize=(8,4))
res=sns.barplot(y=data_exit, x=data_exit.index)
res.set_yticklabels(res.get_ymajorticklabels(), fontsize = 16, color='black')
plt.ylabel('No of Customers Who Churn or Not',fontsize = 16, color='black')
plt.xlabel('Churn/Not Churn',fontsize = 16, color='black')
plt.title('Customers Who Churned or Not',fontsize = 18, color='black')
plt.show()
Observation:
- Only about 20% of the customers (2,037) churned, while roughly 80% (7,963) stayed with the bank.
- This imbalance in the target is what our neural network is going to address.
Age¶
distribution_plot_wrt_target(data, "Age", "Exited");
Observation:
- Churned customers cluster around an age in the mid-40s, with a mode near 46.
- The average age is around 39 years, the median is 37, and the minimum is 18.
- There are outliers in the age variable, with some customers between roughly 58 and 92 years old.
- When the outliers in the age variable are treated, the mode age falls back to around 38.
- These outliers will not be treated, as wide age variation is a natural feature of a customer base and should not distort the analysis.
Balance¶
distribution_plot_wrt_target(data, "Balance", "Exited");
Observation
- The Balance variable shows that many of the bank's customers maintained a zero balance over the period covered by the analysis.
- This may be the result of the cut-off date at which the information was collated, or many of the accounts may be salary accounts from which customers regularly withdraw their money. More questions need to be asked to understand why this is so.
- The second most common balance is around $125,000; the wide gap between these two groups is worth investigating.
- This independent variable is likely to be important to our analysis.
Tenure¶
stacked_barplot(data, "Tenure", "Exited");
Exited 0 1 All Tenure All 7963 2037 10000 1 803 232 1035 9 771 213 984 3 796 213 1009 5 803 209 1012 4 786 203 989 2 847 201 1048 8 828 197 1025 6 771 196 967 7 851 177 1028 10 389 101 490 0 318 95 413 --------------------------------------------------------------------------------
Observation
- In this dataset, Tenure is the number of years the customer has been with the bank.
- There are 11 distinct tenure values (0 to 10 years).
- A tenure of 2 years is the most common (1,048 customers), followed by 1 year (1,035).
- It remains to be seen how this variable will affect our study.
Credit Score¶
distribution_plot_wrt_target(data, "CreditScore", "Exited");
Observation
- The most common credit score observed in the dataset is around 650
- 650 is also close to the average credit score (651)
- A few customers have very low credit scores, regarded as very POOR, below 400
Estimated Salary¶
distribution_plot_wrt_target(data, "EstimatedSalary", "Exited");
Observation
- The estimated salary distribution shows peaks around USD 175,000 and USD 125,000
- The average estimated salary is around USD 100,000
- Customers whose estimated salary is above USD 50,000 constitute roughly 75% of the dataset
- These figures suggest that most of the bank's customers are middle-class income earners, and the bank should take advantage of this income bracket.
- Whether this variable affects churn will become more evident as the analysis continues.
Pairplot of the dataset¶
# pairplot using Exited, the dependent variable, as hue
pp = sns.pairplot(data, hue="Exited")
pp.savefig("output.jpg", bbox_inches='tight')  # save before plt.show(), otherwise the saved figure is blank
plt.show()
Observation:
There is mostly no clear distinction in the distributions of most variables between customers who churned and those who did not, as the colorations of the pairplot are widely intermingled.
- However, the following show lower counts of churning customers: IsActiveMember, HasCrCard, NumOfProducts, Tenure
- The following variables show evidence of contributing to whether a customer churns: Balance, EstimatedSalary, Age, CreditScore
This will guide my approach and where to focus the analysis.
Steps 3 to 11 are combined¶
Data Preprocessing¶
- Missing value treatment
- Feature engineering (if needed)
- Outlier detection and treatment (if needed)
- Preparing data for modeling
- Any other preprocessing steps (if needed)
Since our dataset contains categorical variables like “Gender” and “Geography,” we need to convert them into numerical format for the ANN model to process. We apply the one-hot encoding technique to transform these categorical columns into binary columns, representing different categories.
# use the pandas get_dummies function for one-hot encoding
data = pd.get_dummies(data, columns=["Gender", "Geography"], drop_first=True)
#check the dataframe after the one hot encoding method is applied
data.head()
| | CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | Gender_Male | Geography_Germany | Geography_Spain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 | False | False | False |
| 1 | 608 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 | False | False | True |
| 2 | 502 | 42 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 | 1 | False | False | False |
| 3 | 699 | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 | False | False | False |
| 4 | 850 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 | False | False | True |
data.sample(10)
| | CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | Gender_Male | Geography_Germany | Geography_Spain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9799 | 653 | 46 | 0 | 119556.10 | 1 | 1 | 0 | 78250.13 | 1 | True | False | False |
| 3576 | 647 | 28 | 8 | 0.00 | 2 | 1 | 1 | 91055.27 | 0 | True | False | False |
| 4235 | 595 | 29 | 6 | 150685.79 | 1 | 1 | 0 | 87771.06 | 0 | True | False | False |
| 5221 | 584 | 29 | 4 | 0.00 | 2 | 1 | 0 | 88866.92 | 0 | False | False | True |
| 2489 | 714 | 28 | 6 | 122724.37 | 1 | 1 | 1 | 67057.27 | 0 | True | False | False |
| 1308 | 710 | 33 | 10 | 118327.17 | 2 | 1 | 1 | 192928.82 | 0 | True | False | False |
| 7406 | 489 | 58 | 4 | 0.00 | 2 | 1 | 1 | 191419.32 | 0 | True | False | True |
| 3371 | 742 | 43 | 6 | 97067.69 | 1 | 0 | 1 | 60920.03 | 1 | False | True | False |
| 9717 | 757 | 30 | 3 | 145396.49 | 1 | 0 | 1 | 198341.15 | 0 | True | False | True |
| 7073 | 679 | 40 | 1 | 0.00 | 1 | 1 | 1 | 16897.19 | 0 | True | False | False |
Model Building Approach
Before training our ANN model, we need to preprocess the data by separating the features from the target variable. We then split the dataset into training and testing sets. The training set is used to train the model on historical data, while the testing set is used to evaluate the model’s performance on unseen data. Typically, around 70–80% of the data is used for training, and the remaining 20–30% for testing.
Model Building¶
Model Building: Artificial Neural Network¶
#Starting the model building
from sklearn.model_selection import train_test_split
X = data.drop(columns=['Exited'])
y = data['Exited'].values
X_train_ann,X_test_ann,y_train_ann,y_test_ann=train_test_split(X,y,test_size=0.2,random_state=1)
# check the shape of X_train & X_test, y_train & y_test
print("X_train Shape : ", X_train_ann.shape)
print("X_test Shape : ", X_test_ann.shape)
print("y_train Shape : ", y_train_ann.shape)
print("y_test Shape : ", y_test_ann.shape)
X_train Shape : (8000, 11) X_test Shape : (2000, 11) y_train Shape : (8000,) y_test Shape : (2000,)
# use scikit-learn's StandardScaler to standardize the input feature values
scaler = StandardScaler()
X_train_ann = scaler.fit_transform(X_train_ann)
X_test_ann = scaler.transform(X_test_ann)
# check the scaled X_train_ann values
X_train_ann
array([[-0.23082038, -0.94449979, -0.70174202, ..., 0.91509065,
1.71490137, -0.57273139],
[-0.25150912, -0.94449979, -0.35520275, ..., -1.09278791,
-0.58312392, -0.57273139],
[-0.3963303 , 0.77498705, 0.33787579, ..., -1.09278791,
1.71490137, -0.57273139],
...,
[ 0.22433188, 0.58393295, 1.3774936 , ..., -1.09278791,
-0.58312392, -0.57273139],
[ 0.13123255, 0.01077067, 1.03095433, ..., -1.09278791,
-0.58312392, -0.57273139],
[ 1.1656695 , 0.29735181, 0.33787579, ..., 0.91509065,
1.71490137, -0.57273139]])
# check the scaled X_test_ann values
X_test_ann
array([[-1.03768121, 0.77498705, -1.0482813 , ..., 0.91509065,
-0.58312392, -0.57273139],
[ 0.30708683, -0.46686456, -0.70174202, ..., 0.91509065,
-0.58312392, -0.57273139],
[-1.23422423, 0.29735181, -1.0482813 , ..., -1.09278791,
-0.58312392, -0.57273139],
...,
[-0.86182692, -0.46686456, 1.72403288, ..., 0.91509065,
-0.58312392, 1.74601919],
[-0.30323097, -0.84897275, -1.0482813 , ..., -1.09278791,
1.71490137, -0.57273139],
[ 0.04847759, 1.25262228, 1.3774936 , ..., 0.91509065,
1.71490137, -0.57273139]])
# define the ANN: 11 input features, two hidden layers, one sigmoid output
model_ann = Sequential()
model_ann.add(Dense(11, activation='sigmoid', input_dim = 11))  # first hidden layer
model_ann.add(Dense(11, activation='sigmoid'))  # second hidden layer
model_ann.add(Dense(1, activation='sigmoid'))  # output layer: churn probability
model_ann.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ dense (Dense) │ (None, 11) │ 132 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 11) │ 132 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 1) │ 12 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 276 (1.08 KB)
Trainable params: 276 (1.08 KB)
Non-trainable params: 0 (0.00 B)
model_ann.compile(loss='binary_crossentropy', metrics=['accuracy'])  # optimizer defaults to RMSprop
history_ann = model_ann.fit(X_train_ann, y_train_ann, epochs = 100, validation_split=0.2)
Epoch 1/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.3298 - loss: 0.8753 - val_accuracy: 0.7975 - val_loss: 0.5076
...
Epoch 100/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8692 - loss: 0.3235 - val_accuracy: 0.8575 - val_loss: 0.3402
#checking the loss visualization
plt.title('Visualization of Loss and Validation Loss of ANN Performance')
plt.plot(history_ann.history['loss'], label='train loss')
plt.plot(history_ann.history['val_loss'], label='validation loss')
plt.legend();
#checking the accuracy visualization
plt.title('Visualization of Accuracy and Validation Accuracy of ANN Performance')
plt.plot(history_ann.history['accuracy'], label='train accuracy')
plt.plot(history_ann.history['val_accuracy'], label='validation accuracy')
plt.legend();
y_log1 = model_ann.predict(X_test_ann)
63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step
y_pred_ann = np.where(y_log1 > 0.5, 1, 0)
accuracy_score(y_test_ann, y_pred_ann)
0.8605
print(classification_report(y_test_ann, y_pred_ann))
              precision    recall  f1-score   support

           0       0.88      0.96      0.92      1585
           1       0.76      0.48      0.59       415

    accuracy                           0.86      2000
   macro avg       0.82      0.72      0.75      2000
weighted avg       0.85      0.86      0.85      2000
Observation:
The ANN reaches about 86% accuracy on the test set. However, recall for the churn class (1) is only 0.48, so the model identifies fewer than half of the customers who actually leave.
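Since recall for the churn class is low at the default 0.5 cut-off, one option is to lower the decision threshold applied to the predicted probabilities. A minimal sketch with hypothetical probabilities and labels (`probs` and `y_true` are illustrative stand-ins for the model's test-set output, not values from this dataset):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical churn probabilities and true labels, standing in for
# model_ann.predict(X_test_ann) and y_test_ann.
probs = np.array([0.10, 0.35, 0.45, 0.60, 0.80, 0.30])
y_true = np.array([0, 1, 1, 1, 1, 0])

for threshold in (0.5, 0.3):
    # Same thresholding pattern as np.where(y_log1 > 0.5, 1, 0) above
    y_pred = np.where(probs > threshold, 1, 0)
    print(f"threshold={threshold}: "
          f"recall={recall_score(y_true, y_pred):.2f}, "
          f"precision={precision_score(y_true, y_pred):.2f}")
```

Lowering the threshold flags more customers as likely churners, trading precision for recall; the right trade-off depends on the cost of a missed churner versus a wasted retention offer.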
Model Building: Neural Network model with Adam Optimizer¶
X_train_adm,X_test_adm,y_train_adm,y_test_adm=train_test_split(X,y,test_size=0.2,random_state=1)
# check the shape of X_train & X_test, y_train & y_test
print("X_train Shape : ", X_train_adm.shape)
print("X_test Shape : ", X_test_adm.shape)
print("y_train Shape : ", y_train_adm.shape)
print("y_test Shape : ", y_test_adm.shape)
X_train Shape :  (8000, 11)
X_test Shape :  (2000, 11)
y_train Shape :  (8000,)
y_test Shape :  (2000,)
# use sklearn's StandardScaler to standardize the input feature values
scaler = StandardScaler()
X_train_adm = scaler.fit_transform(X_train_adm)
X_test_adm = scaler.transform(X_test_adm)
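The scaler is fit on the training split only and then reused on the test split, so no test-set statistics leak into training. A minimal sketch with a toy one-column dataset (the arrays are illustrative, not taken from the churn data):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy one-column dataset: the scaler learns the mean and std from the
# training split only, then applies those same statistics to test data.
X_train_toy = np.array([[0.0], [10.0]])   # mean 5, std 5
X_test_toy = np.array([[5.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(X_train_toy)  # [[-1.], [1.]]
test_scaled = scaler.transform(X_test_toy)        # [[0.]]
print(train_scaled.ravel(), test_scaled.ravel())
```

Calling `fit_transform` on the test set instead would rescale it with its own statistics, making train and test features incomparable.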
# check the scaled X_train values
X_train_adm
array([[-0.23082038, -0.94449979, -0.70174202, ..., 0.91509065,
1.71490137, -0.57273139],
[-0.25150912, -0.94449979, -0.35520275, ..., -1.09278791,
-0.58312392, -0.57273139],
[-0.3963303 , 0.77498705, 0.33787579, ..., -1.09278791,
1.71490137, -0.57273139],
...,
[ 0.22433188, 0.58393295, 1.3774936 , ..., -1.09278791,
-0.58312392, -0.57273139],
[ 0.13123255, 0.01077067, 1.03095433, ..., -1.09278791,
-0.58312392, -0.57273139],
[ 1.1656695 , 0.29735181, 0.33787579, ..., 0.91509065,
1.71490137, -0.57273139]])
# check the scaled X_test values
X_test_adm
array([[-1.03768121, 0.77498705, -1.0482813 , ..., 0.91509065,
-0.58312392, -0.57273139],
[ 0.30708683, -0.46686456, -0.70174202, ..., 0.91509065,
-0.58312392, -0.57273139],
[-1.23422423, 0.29735181, -1.0482813 , ..., -1.09278791,
-0.58312392, -0.57273139],
...,
[-0.86182692, -0.46686456, 1.72403288, ..., 0.91509065,
-0.58312392, 1.74601919],
[-0.30323097, -0.84897275, -1.0482813 , ..., -1.09278791,
1.71490137, -0.57273139],
[ 0.04847759, 1.25262228, 1.3774936 , ..., 0.91509065,
1.71490137, -0.57273139]])
model_adm = Sequential()
model_adm.add(Dense(11, activation='relu', input_dim = 11))
model_adm.add(Dense(11, activation='relu'))
model_adm.add(Dense(1, activation='sigmoid'))
model_adm.summary()
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_3 (Dense)                 │ (None, 11)             │           132 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 11)             │           132 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense)                 │ (None, 1)              │            12 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 276 (1.08 KB)
Trainable params: 276 (1.08 KB)
Non-trainable params: 0 (0.00 B)
model_adm.compile(loss='binary_crossentropy', optimizer='adam',metrics=['accuracy'])
history_adm =model_adm.fit(X_train_adm,y_train_adm,epochs=100,validation_split=0.2)
Epoch 1/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.6306 - loss: 0.6669 - val_accuracy: 0.7987 - val_loss: 0.5039
...
Epoch 100/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8657 - loss: 0.3314 - val_accuracy: 0.8494 - val_loss: 0.3516
#checking the loss visualization
plt.title('Visualization of Loss and Validation Loss Performance With Adam Optimizer')
plt.plot(history_adm.history['loss'], label='train loss')
plt.plot(history_adm.history['val_loss'], label='validation loss')
plt.legend();
#checking the accuracy visualization
plt.title('Visualization of Accuracy and Validation Accuracy Performance With Adam Optimizer')
plt.plot(history_adm.history['accuracy'], label='train accuracy')
plt.plot(history_adm.history['val_accuracy'], label='validation accuracy')
plt.legend();
y_log_adm = model_adm.predict(X_test_adm)
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
y_pred_adm = np.where(y_log_adm > 0.5, 1, 0)
accuracy_score(y_test_adm, y_pred_adm)
0.863
print(classification_report(y_test_adm, y_pred_adm))
              precision    recall  f1-score   support

           0       0.88      0.96      0.92      1585
           1       0.78      0.47      0.59       415

    accuracy                           0.86      2000
   macro avg       0.83      0.72      0.75      2000
weighted avg       0.86      0.86      0.85      2000
Observation:
With the Adam optimizer the model reaches about 86.3% accuracy on the test set, essentially the same as before, and recall for the churn class remains low at 0.47.
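The string `'adam'` selects Keras's built-in Adam optimizer, which keeps exponential moving averages of each parameter's gradient and squared gradient and uses them to scale the step per parameter. A minimal numpy sketch of a single Adam update with the default hyperparameters (the function is illustrative, not the Keras implementation):

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: update biased first/second moment estimates,
    bias-correct them, then take a per-parameter scaled step."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# One step on a single weight with gradient 2.0: the effective step is
# roughly lr, because m_hat / sqrt(v_hat) is close to the gradient's sign.
w, m, v = adam_step(0.5, 2.0, m=0.0, v=0.0, t=1)
print(w)  # ≈ 0.499
```

This per-parameter scaling is why Adam typically needs less learning-rate tuning than plain SGD on problems like this one.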
Model Improvement: Neural Network model with Dropout Regularization¶
X_train_dpo,X_test_dpo,y_train_dpo,y_test_dpo=train_test_split(X,y,test_size=0.2,random_state=1)
# check the shape of X_train & X_test, y_train & y_test
print("X_train Shape : ", X_train_dpo.shape)
print("X_test Shape : ", X_test_dpo.shape)
print("y_train Shape : ", y_train_dpo.shape)
print("y_test Shape : ", y_test_dpo.shape)
X_train Shape : (8000, 11) X_test Shape : (2000, 11) y_train Shape : (8000,) y_test Shape : (2000,)
# use sklearn's StandardScaler to standardize the input feature values
scaler = StandardScaler()
X_train_scaled_dpo = scaler.fit_transform(X_train_dpo)
X_test_scaled_dpo = scaler.transform(X_test_dpo)
X_train_scaled_dpo
array([[-0.23082038, -0.94449979, -0.70174202, ..., 0.91509065,
1.71490137, -0.57273139],
[-0.25150912, -0.94449979, -0.35520275, ..., -1.09278791,
-0.58312392, -0.57273139],
[-0.3963303 , 0.77498705, 0.33787579, ..., -1.09278791,
1.71490137, -0.57273139],
...,
[ 0.22433188, 0.58393295, 1.3774936 , ..., -1.09278791,
-0.58312392, -0.57273139],
[ 0.13123255, 0.01077067, 1.03095433, ..., -1.09278791,
-0.58312392, -0.57273139],
[ 1.1656695 , 0.29735181, 0.33787579, ..., 0.91509065,
1.71490137, -0.57273139]])
X_test_scaled_dpo
array([[-1.03768121, 0.77498705, -1.0482813 , ..., 0.91509065,
-0.58312392, -0.57273139],
[ 0.30708683, -0.46686456, -0.70174202, ..., 0.91509065,
-0.58312392, -0.57273139],
[-1.23422423, 0.29735181, -1.0482813 , ..., -1.09278791,
-0.58312392, -0.57273139],
...,
[-0.86182692, -0.46686456, 1.72403288, ..., 0.91509065,
-0.58312392, 1.74601919],
[-0.30323097, -0.84897275, -1.0482813 , ..., -1.09278791,
1.71490137, -0.57273139],
[ 0.04847759, 1.25262228, 1.3774936 , ..., 0.91509065,
1.71490137, -0.57273139]])
# Define the model
model1 = Sequential()
model1.add(Dense(11, activation='relu', input_shape=(11,)))
model1.add(Dropout(0.5))
model1.add(Dense(11, activation='relu'))
model1.add(Dropout(0.2))
model1.add(Dense(1, activation='sigmoid'))
model1.summary()
Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_6 (Dense)                 │ (None, 11)             │           132 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 11)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_7 (Dense)                 │ (None, 11)             │           132 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 11)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_8 (Dense)                 │ (None, 1)              │            12 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 276 (1.08 KB)
Trainable params: 276 (1.08 KB)
Non-trainable params: 0 (0.00 B)
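During training, `Dropout(0.5)` randomly zeroes half of the layer's activations on each batch and rescales the survivors so the expected activation is unchanged (inverted dropout); at inference the layer is a no-op. A minimal numpy sketch of the training-time behaviour (the function is illustrative, not the Keras implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_train(x, rate):
    """Inverted dropout: zero a random fraction `rate` of activations
    and scale the survivors by 1 / (1 - rate) so the expected value
    of each activation is unchanged."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

activations = np.ones(10000)
dropped = dropout_train(activations, rate=0.5)
# Roughly half the units are zeroed, and the mean stays near 1.0.
print((dropped == 0).mean(), dropped.mean())
```

Because each batch trains a different random sub-network, the units cannot co-adapt too strongly, which is the regularizing effect this section relies on.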
# Compile the model
model1.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
history_dpo = model1.fit(X_train_scaled_dpo, y_train_dpo, epochs = 100, validation_split=0.2)
Epoch 1/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.7721 - loss: 0.5769 - val_accuracy: 0.7975 - val_loss: 0.4869
...
Epoch 61/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8202 - loss: 0.4196 - val_accuracy: 0.8281 - val_loss: 0.3899
Epoch 62/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy:
0.8199 - loss: 0.4059 - val_accuracy: 0.8269 - val_loss: 0.3900 Epoch 63/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8248 - loss: 0.4071 - val_accuracy: 0.8288 - val_loss: 0.3867 Epoch 64/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8156 - loss: 0.4209 - val_accuracy: 0.8281 - val_loss: 0.3866 Epoch 65/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8163 - loss: 0.4138 - val_accuracy: 0.8263 - val_loss: 0.3868 Epoch 66/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8256 - loss: 0.4077 - val_accuracy: 0.8288 - val_loss: 0.3875 Epoch 67/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8154 - loss: 0.4237 - val_accuracy: 0.8294 - val_loss: 0.3857 Epoch 68/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8120 - loss: 0.4189 - val_accuracy: 0.8269 - val_loss: 0.3888 Epoch 69/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8186 - loss: 0.4161 - val_accuracy: 0.8269 - val_loss: 0.3891 Epoch 70/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8316 - loss: 0.3950 - val_accuracy: 0.8300 - val_loss: 0.3889 Epoch 71/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8188 - loss: 0.4140 - val_accuracy: 0.8269 - val_loss: 0.3857 Epoch 72/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8175 - loss: 0.4164 - val_accuracy: 0.8313 - val_loss: 0.3859 Epoch 73/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8259 - loss: 0.4048 - val_accuracy: 0.8319 - val_loss: 0.3836 Epoch 74/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8246 - loss: 0.4116 - val_accuracy: 0.8319 - val_loss: 0.3860 Epoch 75/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8261 - loss: 0.4178 - val_accuracy: 0.8363 - val_loss: 0.3835 Epoch 76/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8288 - loss: 0.4073 - val_accuracy: 0.8344 - val_loss: 0.3849 Epoch 77/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8286 - loss: 0.4052 - val_accuracy: 0.8325 - 
val_loss: 0.3859 Epoch 78/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8243 - loss: 0.4094 - val_accuracy: 0.8313 - val_loss: 0.3857 Epoch 79/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8251 - loss: 0.4089 - val_accuracy: 0.8300 - val_loss: 0.3895 Epoch 80/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8213 - loss: 0.4111 - val_accuracy: 0.8294 - val_loss: 0.3878 Epoch 81/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8217 - loss: 0.4122 - val_accuracy: 0.8294 - val_loss: 0.3815 Epoch 82/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8281 - loss: 0.4057 - val_accuracy: 0.8325 - val_loss: 0.3868 Epoch 83/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8237 - loss: 0.4117 - val_accuracy: 0.8344 - val_loss: 0.3874 Epoch 84/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8245 - loss: 0.4095 - val_accuracy: 0.8331 - val_loss: 0.3903 Epoch 85/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8222 - loss: 0.4066 - val_accuracy: 0.8344 - val_loss: 0.3854 Epoch 86/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8267 - loss: 0.4105 - val_accuracy: 0.8288 - val_loss: 0.3846 Epoch 87/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8174 - loss: 0.4236 - val_accuracy: 0.8338 - val_loss: 0.3856 Epoch 88/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8175 - loss: 0.4180 - val_accuracy: 0.8313 - val_loss: 0.3839 Epoch 89/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8210 - loss: 0.4133 - val_accuracy: 0.8325 - val_loss: 0.3849 Epoch 90/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8280 - loss: 0.4143 - val_accuracy: 0.8325 - val_loss: 0.3833 Epoch 91/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8300 - loss: 0.4045 - val_accuracy: 0.8319 - val_loss: 0.3941 Epoch 92/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8172 - loss: 0.4243 - val_accuracy: 0.8313 - val_loss: 0.3873 Epoch 93/100 200/200 
━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8218 - loss: 0.4191 - val_accuracy: 0.8294 - val_loss: 0.3860 Epoch 94/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8239 - loss: 0.4065 - val_accuracy: 0.8294 - val_loss: 0.3874 Epoch 95/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8301 - loss: 0.4104 - val_accuracy: 0.8350 - val_loss: 0.3848 Epoch 96/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8188 - loss: 0.4123 - val_accuracy: 0.8319 - val_loss: 0.3884 Epoch 97/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8158 - loss: 0.4331 - val_accuracy: 0.8300 - val_loss: 0.3854 Epoch 98/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8330 - loss: 0.4001 - val_accuracy: 0.8306 - val_loss: 0.3889 Epoch 99/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8268 - loss: 0.4020 - val_accuracy: 0.8356 - val_loss: 0.3851 Epoch 100/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8186 - loss: 0.4158 - val_accuracy: 0.8344 - val_loss: 0.3836
# visualize training vs. validation loss for the dropout model
plt.title('Loss and Validation Loss With Dropout')
plt.plot(history_dpo.history['loss'], label='loss')
plt.plot(history_dpo.history['val_loss'], label='val_loss')
plt.legend()
plt.show()
# visualize training vs. validation accuracy for the dropout model
plt.title('Accuracy and Validation Accuracy With Dropout')
plt.plot(history_dpo.history['accuracy'], label='accuracy')
plt.plot(history_dpo.history['val_accuracy'], label='val_accuracy')
plt.legend()
plt.show()
y_log1c = model1.predict(X_test_scaled_dpo)
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
y_pred_dpo = np.where(y_log1c > 0.5, 1, 0)
accuracy_score(y_test_dpo, y_pred_dpo)
0.8305
print(classification_report(y_test_dpo, y_pred_dpo))
              precision    recall  f1-score   support

           0       0.83      1.00      0.90      1585
           1       0.93      0.20      0.33       415

    accuracy                           0.83      2000
   macro avg       0.88      0.60      0.61      2000
weighted avg       0.85      0.83      0.78      2000
Observation:
The dropout model reaches a test accuracy of about 83.05%. That headline figure is misleading, though: recall for the churn class (Exited = 1) is only 0.20, so the model misses roughly four out of five actual churners while classifying nearly all non-churners correctly.
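Because the network outputs a probability, the 0.5 cutoff used in `np.where` is itself a tunable choice. Below is a minimal sketch of how lowering the threshold trades precision for recall; `y_true` and `y_prob` are illustrative stand-in arrays, not the notebook's actual `y_test_dpo` labels and `y_log1c` probabilities.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Illustrative stand-ins for the model outputs; in the notebook these would be
# y_log1c (predicted probabilities) and y_test_dpo (true labels).
rng = np.random.default_rng(1)
y_true = (rng.random(1000) < 0.2).astype(int)            # ~20% churners
y_prob = 0.3 * y_true + rng.random(1000) * 0.6           # churners score higher on average

for threshold in (0.5, 0.4, 0.3):
    y_pred = (y_prob > threshold).astype(int)
    print(threshold,
          round(precision_score(y_true, y_pred), 2),
          round(recall_score(y_true, y_pred), 2))
```

Lower thresholds flag more customers as potential churners, raising recall at the cost of precision, which is a useful lever when missing a churner is costlier than a false alarm.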
Model Improvement: Neural Network model with Hyperparameter Tuning¶
X_train_hpt,X_test_hpt,y_train_hpt,y_test_hpt=train_test_split(X,y,test_size=0.2,random_state=1)
# check the shape of X_train & X_test, y_train & y_test
print("X_train Shape : ", X_train_hpt.shape)
print("X_test Shape : ", X_test_hpt.shape)
print("y_train Shape : ", y_train_hpt.shape)
print("y_test Shape : ", y_test_hpt.shape)
X_train Shape :  (8000, 11)
X_test Shape :  (2000, 11)
y_train Shape :  (8000,)
y_test Shape :  (2000,)
# use sklearn's StandardScaler to standardize the input feature values
scaler = StandardScaler()
X_train_scaled_hpt = scaler.fit_transform(X_train_hpt)
X_test_scaled_hpt = scaler.transform(X_test_hpt)
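Note that the scaler is fit on the training split only and then reused on the test split, which keeps test-set statistics out of the preprocessing. A minimal self-contained sketch of that pattern (using synthetic stand-in arrays, not the notebook's data):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the train/test feature matrices.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
X_test = rng.normal(loc=5.0, scale=2.0, size=(40, 3))

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)   # learn mean/std from train only
X_test_scaled = scaler.transform(X_test)         # reuse the train statistics

# The training features now have ~zero mean and unit variance per column;
# the test features are shifted/scaled by the *train* mean and std.
print(X_train_scaled.mean(axis=0).round(6), X_train_scaled.std(axis=0).round(6))
```

Calling `fit_transform` on the test set instead would silently leak test statistics into the pipeline and make the two splits incomparable.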
# check the X_train_scaled values
X_train_scaled_hpt
array([[-0.23082038, -0.94449979, -0.70174202, ..., 0.91509065,
1.71490137, -0.57273139],
[-0.25150912, -0.94449979, -0.35520275, ..., -1.09278791,
-0.58312392, -0.57273139],
[-0.3963303 , 0.77498705, 0.33787579, ..., -1.09278791,
1.71490137, -0.57273139],
...,
[ 0.22433188, 0.58393295, 1.3774936 , ..., -1.09278791,
-0.58312392, -0.57273139],
[ 0.13123255, 0.01077067, 1.03095433, ..., -1.09278791,
-0.58312392, -0.57273139],
[ 1.1656695 , 0.29735181, 0.33787579, ..., 0.91509065,
1.71490137, -0.57273139]])
X_test_scaled_hpt
array([[-1.03768121, 0.77498705, -1.0482813 , ..., 0.91509065,
-0.58312392, -0.57273139],
[ 0.30708683, -0.46686456, -0.70174202, ..., 0.91509065,
-0.58312392, -0.57273139],
[-1.23422423, 0.29735181, -1.0482813 , ..., -1.09278791,
-0.58312392, -0.57273139],
...,
[-0.86182692, -0.46686456, 1.72403288, ..., 0.91509065,
-0.58312392, 1.74601919],
[-0.30323097, -0.84897275, -1.0482813 , ..., -1.09278791,
1.71490137, -0.57273139],
[ 0.04847759, 1.25262228, 1.3774936 , ..., 0.91509065,
1.71490137, -0.57273139]])
# Define the hyperparameter space
hyperparameters = {
'activation': ['relu', 'sigmoid'],
'learning_rate': [0.001,0.01,0.1,0.05]
}
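The dictionary above defines the search space but is not consumed by any tuner in this notebook. A minimal sketch of a manual grid search over it is shown below; the model shape mirrors the three-layer network built in this section, while `X_demo`/`y_demo` are illustrative stand-ins for the scaled training data.

```python
import itertools
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

# Stand-in data; in the notebook this would be X_train_scaled_hpt / y_train_hpt.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 11)).astype('float32')
y_demo = (rng.random(200) > 0.8).astype('float32')

hyperparameters = {
    'activation': ['relu', 'sigmoid'],
    'learning_rate': [0.001, 0.01, 0.1, 0.05],
}

def build_model(activation, learning_rate):
    model = Sequential([
        Dense(11, activation=activation, input_dim=11),
        Dense(11, activation=activation),
        Dense(1, activation='sigmoid'),
    ])
    model.compile(loss='binary_crossentropy',
                  optimizer=Adam(learning_rate=learning_rate),
                  metrics=['accuracy'])
    return model

# Train one model per (activation, learning_rate) pair and track validation accuracy.
results = []
for activation, lr in itertools.product(hyperparameters['activation'],
                                        hyperparameters['learning_rate']):
    model = build_model(activation, lr)
    hist = model.fit(X_demo, y_demo, epochs=2, validation_split=0.2, verbose=0)
    results.append(((activation, lr), hist.history['val_accuracy'][-1]))

best_params, best_val_acc = max(results, key=lambda kv: kv[1])
print(best_params, best_val_acc)
```

In practice a dedicated tool such as KerasTuner or scikit-learn's `GridSearchCV` (via a Keras wrapper) would also handle cross-validation and logging, but the loop above shows the core idea.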
# Create the neural network model
model_hpt = Sequential()
model_hpt.add(Dense(11, activation='relu', input_dim = 11))
model_hpt.add(Dense(11, activation='relu'))
model_hpt.add(Dense(1, activation='sigmoid'))
model_hpt.summary()
Model: "sequential_3"
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)         ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ dense_9 (Dense)      │ (None, 11)   │     132 │
├──────────────────────┼──────────────┼─────────┤
│ dense_10 (Dense)     │ (None, 11)   │     132 │
├──────────────────────┼──────────────┼─────────┤
│ dense_11 (Dense)     │ (None, 1)    │      12 │
└──────────────────────┴──────────────┴─────────┘
Total params: 276 (1.08 KB)
Trainable params: 276 (1.08 KB)
Non-trainable params: 0 (0.00 B)
# Compile the model
model_hpt.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
history_hpt = model_hpt.fit(X_train_scaled_hpt, y_train_hpt, epochs = 100, validation_split=0.2)
Epoch 1/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.7786 - loss: 0.5586 - val_accuracy: 0.7975 - val_loss: 0.4793
Epoch 2/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8011 - loss: 0.4680 - val_accuracy: 0.8050 - val_loss: 0.4415
...
Epoch 99/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8642 - loss: 0.3140 - val_accuracy: 0.8569 - val_loss: 0.3499
Epoch 100/100 200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8689 - loss: 0.3177 - val_accuracy: 0.8562 - val_loss: 0.3515
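In the log above, validation loss largely stops improving after roughly epoch 40 while training runs to 100. One common remedy, not used in this notebook, is an `EarlyStopping` callback that halts once `val_loss` plateaus and restores the best weights. A hedged sketch with stand-in data (`X_demo`/`y_demo` substitute for the scaled training arrays):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

# Stand-in data; in the notebook this would be X_train_scaled_hpt / y_train_hpt.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 11)).astype('float32')
y_demo = (rng.random(500) > 0.8).astype('float32')

model = Sequential([
    Dense(11, activation='relu', input_dim=11),
    Dense(11, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Stop once val_loss has not improved for 10 epochs and keep the best weights seen.
early_stop = EarlyStopping(monitor='val_loss', patience=10,
                           restore_best_weights=True)
history = model.fit(X_demo, y_demo, epochs=100, validation_split=0.2,
                    callbacks=[early_stop], verbose=0)
print(len(history.history['loss']))
```

Besides saving compute, stopping at the best validation epoch guards against the mild overfitting visible late in the 100-epoch runs.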
# visualize training vs. validation loss for the tuned model
plt.title('Loss and Validation Loss With Hyperparameter Tuning')
plt.plot(history_hpt.history['loss'], label='loss')
plt.plot(history_hpt.history['val_loss'], label='val_loss')
plt.legend()
plt.show()
# visualize training vs. validation accuracy for the tuned model
plt.title('Accuracy and Validation Accuracy With Hyperparameter Tuning')
plt.plot(history_hpt.history['accuracy'], label='accuracy')
plt.plot(history_hpt.history['val_accuracy'], label='val_accuracy')
plt.legend()
plt.show()
y_log_hpt = model_hpt.predict(X_test_scaled_hpt)
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
y_pred_hpt = np.where(y_log_hpt > 0.5, 1, 0)
accuracy_score(y_test_hpt, y_pred_hpt)
0.8695
print(classification_report(y_test_hpt, y_pred_hpt))
              precision    recall  f1-score   support

           0       0.88      0.97      0.92      1585
           1       0.81      0.48      0.61       415

    accuracy                           0.87      2000
   macro avg       0.85      0.73      0.76      2000
weighted avg       0.86      0.87      0.86      2000
Observation:
This model reaches a test accuracy of about 86.95%, and recall for the churn class improves to 0.48 from 0.20 for the dropout model, though it still misses more than half of the actual churners.
Model Improvement: Neural Network model with balanced data¶
Customer churn datasets often suffer from class imbalance: churners are far less common than non-churners. To address this, we balance the target column by upsampling the minority class so the model does not become biased toward the majority class during training. One caveat: because the upsampling here happens before the train-test split, duplicated minority rows can land in both the training and test sets, which tends to inflate test scores; upsampling only the training split would avoid this leakage.
# Separate majority and minority classes
majority_class = data[data['Exited'] == 0]
minority_class = data[data['Exited'] == 1]
# Upsample the minority class
upsampled_minority = resample(
    minority_class,
    replace=True,                    # sample with replacement
    n_samples=len(majority_class),   # match the number of majority class samples
    random_state=42                  # set random state for reproducibility
)
# Combine the upsampled minority class with the majority class
balanced_data = pd.concat([majority_class, upsampled_minority])
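An alternative to duplicating minority rows is to keep the data as-is and weight the loss instead. A minimal sketch using scikit-learn's `compute_class_weight`; the counts mirror this dataset's roughly 80/20 split, and the resulting dict would be passed to Keras as `model.fit(..., class_weight=class_weight)`.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Illustrative labels matching this dataset's class counts; in the notebook
# this would be the original (imbalanced) training labels, not balanced_data.
y_demo = np.array([0] * 7963 + [1] * 2037)

weights = compute_class_weight(class_weight='balanced',
                               classes=np.array([0, 1]), y=y_demo)
class_weight = {0: weights[0], 1: weights[1]}
print(class_weight)
```

Class weighting avoids the leakage risk of upsampling before the split and keeps the test set's class distribution realistic.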
# check the target_balanced_data
balanced_data
| | CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | Gender_Male | Geography_Germany | Geography_Spain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 608 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 | False | False | True |
| 3 | 699 | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 | False | False | False |
| 4 | 850 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 | False | False | True |
| 6 | 822 | 50 | 7 | 0.00 | 2 | 1 | 1 | 10062.80 | 0 | True | False | False |
| 8 | 501 | 44 | 4 | 142051.07 | 2 | 0 | 1 | 74940.50 | 0 | True | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 5701 | 625 | 49 | 4 | 128504.76 | 1 | 1 | 0 | 126812.63 | 1 | False | True | False |
| 9337 | 466 | 47 | 5 | 102085.72 | 1 | 1 | 1 | 183536.24 | 1 | False | True | False |
| 4345 | 720 | 46 | 3 | 97042.60 | 1 | 1 | 1 | 133516.51 | 1 | True | True | False |
| 1085 | 803 | 42 | 5 | 0.00 | 1 | 1 | 0 | 196466.83 | 1 | True | False | False |
| 3694 | 608 | 33 | 4 | 0.00 | 1 | 1 | 0 | 79304.38 | 1 | False | False | False |
15926 rows × 12 columns
# check the value counts of the "Exited" target column again after balancing the data
data_exit1 = balanced_data['Exited'].value_counts()
data_exit1
| Exited | count |
|---|---|
| 0 | 7963 |
| 1 | 7963 |
# let's perform univariate EDA on the "Exited" target column
plt.figure(figsize=(10,6))
res=sns.barplot(y=data_exit1, x=data_exit1.index)
res.set_yticklabels(res.get_ymajorticklabels(), fontsize = 16, color='black')
plt.ylabel('No of Customers Who Churn or Not',fontsize = 16, color='black')
plt.xlabel('Churn/Not Churn',fontsize = 16, color='black')
plt.title('Balanced Data - Customers Who Churned or Not',fontsize = 18, color='black')
plt.show()
# split the features into the X variable and the target variable into the y variable
y = balanced_data['Exited']
X = balanced_data.drop('Exited',axis=1)
# use sklearn for train_test_split
X_train_bal, X_test_bal, y_train_bal, y_test_bal = train_test_split(X, y, test_size = 0.2, random_state = 1)
# check the shape of X_train & X_test, y_train & y_test
print("X_train Shape : ", X_train_bal.shape)
print("X_test Shape : ", X_test_bal.shape)
print("y_train Shape : ", y_train_bal.shape)
print("y_test Shape : ", y_test_bal.shape)
X_train Shape :  (12740, 11)
X_test Shape :  (3186, 11)
y_train Shape :  (12740,)
y_test Shape :  (3186,)
# use sklearn's StandardScaler to standardize the input feature values
scaler = StandardScaler()
X_train_scaled_bal = scaler.fit_transform(X_train_bal)
X_test_scaled_bal = scaler.transform(X_test_bal)
# check the X_train_scaled values
X_train_scaled_bal
array([[-0.23415932, 0.17867584, -0.34133383, ..., 0.97442428,
-0.6617577 , -0.54820692],
[ 1.20116567, 0.83930236, 1.36239323, ..., -1.02624701,
-0.6617577 , -0.54820692],
[ 0.05290568, -0.76507633, -1.36357006, ..., 0.97442428,
-0.6617577 , -0.54820692],
...,
[ 0.9756146 , 2.25493062, -1.02282465, ..., 0.97442428,
1.51112711, -0.54820692],
[ 0.47325086, -0.95382676, 1.70313864, ..., -1.02624701,
-0.6617577 , -0.54820692],
[ 0.30921371, 0.93367758, -1.36357006, ..., -1.02624701,
1.51112711, -0.54820692]])
# check the X_test_scaled values
X_test_scaled_bal
array([[-0.37769182, 0.46180149, 0.340157 , ..., 0.97442428,
-0.6617577 , 1.82412874],
[-0.72627074, 0.17867584, -0.68207924, ..., -1.02624701,
1.51112711, -0.54820692],
[ 0.28870907, -0.10444981, -1.02282465, ..., -1.02624701,
-0.6617577 , -0.54820692],
...,
[ 0.50400782, -0.85945155, 0.68090241, ..., 0.97442428,
-0.6617577 , -0.54820692],
[-0.04961753, 1.97180497, 1.36239323, ..., 0.97442428,
-0.6617577 , -0.54820692],
[ 0.47325086, -0.57632589, 0.68090241, ..., -1.02624701,
-0.6617577 , 1.82412874]])
model_bal = Sequential()
model_bal.add(Dense(11, activation='relu', input_dim = 11))
model_bal.add(Dense(11, activation='relu'))
model_bal.add(Dense(1, activation='sigmoid'))
model_bal.summary()
Model: "sequential_4"
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)         ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ dense_12 (Dense)     │ (None, 11)   │     132 │
├──────────────────────┼──────────────┼─────────┤
│ dense_13 (Dense)     │ (None, 11)   │     132 │
├──────────────────────┼──────────────┼─────────┤
│ dense_14 (Dense)     │ (None, 1)    │      12 │
└──────────────────────┴──────────────┴─────────┘
Total params: 276 (1.08 KB)
Trainable params: 276 (1.08 KB)
Non-trainable params: 0 (0.00 B)
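The parameter counts in the summary can be verified by hand: a Dense layer has inputs × units weights plus one bias per unit. A quick check (plain Python, no Keras needed):

```python
# Dense layer parameter count: weights (n_in * n_units) plus biases (n_units).
def dense_params(n_in, n_units):
    return n_in * n_units + n_units

# (input_dim, units) for the three Dense layers of model_bal
layers = [(11, 11), (11, 11), (11, 1)]
counts = [dense_params(n_in, n_units) for n_in, n_units in layers]
print(counts, sum(counts))  # [132, 132, 12] 276, matching model_bal.summary()
```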
model_bal.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history_bal = model_bal.fit(X_train_scaled_bal, y_train_bal, epochs = 100, validation_split=0.2)
Epoch 1/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 3s 6ms/step - accuracy: 0.6341 - loss: 0.6438 - val_accuracy: 0.6994 - val_loss: 0.5719 Epoch 2/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7187 - loss: 0.5620 - val_accuracy: 0.7363 - val_loss: 0.5362 Epoch 3/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7398 - loss: 0.5234 - val_accuracy: 0.7469 - val_loss: 0.5124 Epoch 4/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7525 - loss: 0.5093 - val_accuracy: 0.7531 - val_loss: 0.4979 Epoch 5/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7543 - loss: 0.4927 - val_accuracy: 0.7622 - val_loss: 0.4856 Epoch 6/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7677 - loss: 0.4766 - val_accuracy: 0.7649 - val_loss: 0.4781 Epoch 7/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7587 - loss: 0.4824 - val_accuracy: 0.7700 - val_loss: 0.4715 Epoch 8/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7709 - loss: 0.4679 - val_accuracy: 0.7673 - val_loss: 0.4665 Epoch 9/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7732 - loss: 0.4614 - val_accuracy: 0.7688 - val_loss: 0.4629 Epoch 10/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7749 - loss: 0.4570 - val_accuracy: 0.7743 - val_loss: 0.4593 Epoch 11/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7781 - loss: 0.4530 - val_accuracy: 0.7720 - val_loss: 0.4582 Epoch 12/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7805 - loss: 0.4511 - val_accuracy: 0.7739 - val_loss: 0.4569 Epoch 13/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7745 - loss: 0.4531 - val_accuracy: 0.7814 - val_loss: 0.4535 Epoch 14/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7892 - loss: 0.4363 - val_accuracy: 0.7818 - val_loss: 0.4531 Epoch 15/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7911 - loss: 0.4377 - val_accuracy: 0.7814 - val_loss: 0.4516 Epoch 16/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - 
accuracy: 0.7854 - loss: 0.4411 - val_accuracy: 0.7814 - val_loss: 0.4501 Epoch 17/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7862 - loss: 0.4391 - val_accuracy: 0.7822 - val_loss: 0.4502 Epoch 18/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7909 - loss: 0.4371 - val_accuracy: 0.7865 - val_loss: 0.4489 Epoch 19/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7874 - loss: 0.4375 - val_accuracy: 0.7838 - val_loss: 0.4481 Epoch 20/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7937 - loss: 0.4345 - val_accuracy: 0.7806 - val_loss: 0.4466 Epoch 21/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7914 - loss: 0.4286 - val_accuracy: 0.7834 - val_loss: 0.4456 Epoch 22/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7918 - loss: 0.4380 - val_accuracy: 0.7838 - val_loss: 0.4442 Epoch 23/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7871 - loss: 0.4409 - val_accuracy: 0.7841 - val_loss: 0.4436 Epoch 24/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7895 - loss: 0.4272 - val_accuracy: 0.7857 - val_loss: 0.4442 Epoch 25/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7874 - loss: 0.4356 - val_accuracy: 0.7818 - val_loss: 0.4444 Epoch 26/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7915 - loss: 0.4322 - val_accuracy: 0.7881 - val_loss: 0.4425 Epoch 27/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7941 - loss: 0.4285 - val_accuracy: 0.7877 - val_loss: 0.4444 Epoch 28/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7947 - loss: 0.4266 - val_accuracy: 0.7889 - val_loss: 0.4407 Epoch 29/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7938 - loss: 0.4331 - val_accuracy: 0.7885 - val_loss: 0.4417 Epoch 30/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7935 - loss: 0.4313 - val_accuracy: 0.7857 - val_loss: 0.4426 Epoch 31/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7936 - loss: 0.4281 - val_accuracy: 
0.7885 - val_loss: 0.4430 Epoch 32/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7995 - loss: 0.4253 - val_accuracy: 0.7904 - val_loss: 0.4402 Epoch 33/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7902 - loss: 0.4275 - val_accuracy: 0.7845 - val_loss: 0.4386 Epoch 34/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7919 - loss: 0.4251 - val_accuracy: 0.7881 - val_loss: 0.4394 Epoch 35/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7952 - loss: 0.4266 - val_accuracy: 0.7896 - val_loss: 0.4398 Epoch 36/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7955 - loss: 0.4227 - val_accuracy: 0.7916 - val_loss: 0.4377 Epoch 37/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7962 - loss: 0.4227 - val_accuracy: 0.7873 - val_loss: 0.4374 Epoch 38/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7898 - loss: 0.4265 - val_accuracy: 0.7904 - val_loss: 0.4385 Epoch 39/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8001 - loss: 0.4229 - val_accuracy: 0.7932 - val_loss: 0.4382 Epoch 40/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7917 - loss: 0.4342 - val_accuracy: 0.7928 - val_loss: 0.4375 Epoch 41/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7964 - loss: 0.4286 - val_accuracy: 0.7904 - val_loss: 0.4364 Epoch 42/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7979 - loss: 0.4265 - val_accuracy: 0.7900 - val_loss: 0.4366 Epoch 43/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7935 - loss: 0.4339 - val_accuracy: 0.7928 - val_loss: 0.4358 Epoch 44/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7954 - loss: 0.4238 - val_accuracy: 0.7908 - val_loss: 0.4363 Epoch 45/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8025 - loss: 0.4173 - val_accuracy: 0.7916 - val_loss: 0.4359 Epoch 46/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7899 - loss: 0.4301 - val_accuracy: 0.7920 - val_loss: 0.4359 Epoch 47/100 319/319 
━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8005 - loss: 0.4208 - val_accuracy: 0.7924 - val_loss: 0.4369 Epoch 48/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7988 - loss: 0.4194 - val_accuracy: 0.7951 - val_loss: 0.4361 Epoch 49/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7962 - loss: 0.4235 - val_accuracy: 0.7932 - val_loss: 0.4358 Epoch 50/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7956 - loss: 0.4293 - val_accuracy: 0.7932 - val_loss: 0.4367 Epoch 51/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7952 - loss: 0.4192 - val_accuracy: 0.7928 - val_loss: 0.4358 Epoch 52/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7934 - loss: 0.4265 - val_accuracy: 0.7943 - val_loss: 0.4373 Epoch 53/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7960 - loss: 0.4235 - val_accuracy: 0.7947 - val_loss: 0.4356 Epoch 54/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8040 - loss: 0.4162 - val_accuracy: 0.7928 - val_loss: 0.4358 Epoch 55/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7989 - loss: 0.4168 - val_accuracy: 0.7912 - val_loss: 0.4359 Epoch 56/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8027 - loss: 0.4161 - val_accuracy: 0.7912 - val_loss: 0.4383 Epoch 57/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7961 - loss: 0.4309 - val_accuracy: 0.7912 - val_loss: 0.4351 Epoch 58/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8030 - loss: 0.4169 - val_accuracy: 0.7940 - val_loss: 0.4356 Epoch 59/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7975 - loss: 0.4200 - val_accuracy: 0.7955 - val_loss: 0.4345 Epoch 60/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7909 - loss: 0.4313 - val_accuracy: 0.7947 - val_loss: 0.4341 Epoch 61/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7923 - loss: 0.4320 - val_accuracy: 0.7959 - val_loss: 0.4342 Epoch 62/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 
0.8083 - loss: 0.4166 - val_accuracy: 0.7892 - val_loss: 0.4354 Epoch 63/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8000 - loss: 0.4178 - val_accuracy: 0.7955 - val_loss: 0.4344 Epoch 64/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8046 - loss: 0.4127 - val_accuracy: 0.7943 - val_loss: 0.4357 Epoch 65/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8014 - loss: 0.4199 - val_accuracy: 0.7936 - val_loss: 0.4355 Epoch 66/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7974 - loss: 0.4219 - val_accuracy: 0.7991 - val_loss: 0.4351 Epoch 67/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7943 - loss: 0.4245 - val_accuracy: 0.7936 - val_loss: 0.4379 Epoch 68/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8074 - loss: 0.4115 - val_accuracy: 0.7920 - val_loss: 0.4359 Epoch 69/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7995 - loss: 0.4181 - val_accuracy: 0.7959 - val_loss: 0.4362 Epoch 70/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7998 - loss: 0.4189 - val_accuracy: 0.7959 - val_loss: 0.4348 Epoch 71/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8019 - loss: 0.4182 - val_accuracy: 0.7951 - val_loss: 0.4348 Epoch 72/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8013 - loss: 0.4171 - val_accuracy: 0.7932 - val_loss: 0.4347 Epoch 73/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8007 - loss: 0.4233 - val_accuracy: 0.7928 - val_loss: 0.4336 Epoch 74/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7966 - loss: 0.4257 - val_accuracy: 0.7975 - val_loss: 0.4339 Epoch 75/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7998 - loss: 0.4211 - val_accuracy: 0.7947 - val_loss: 0.4337 Epoch 76/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8027 - loss: 0.4137 - val_accuracy: 0.7955 - val_loss: 0.4347 Epoch 77/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8083 - loss: 0.4122 - val_accuracy: 0.7963 - 
val_loss: 0.4337 Epoch 78/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8025 - loss: 0.4132 - val_accuracy: 0.7940 - val_loss: 0.4342 Epoch 79/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8020 - loss: 0.4139 - val_accuracy: 0.7951 - val_loss: 0.4347 Epoch 80/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8012 - loss: 0.4234 - val_accuracy: 0.7983 - val_loss: 0.4353 Epoch 81/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8075 - loss: 0.4118 - val_accuracy: 0.7932 - val_loss: 0.4339 Epoch 82/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7997 - loss: 0.4231 - val_accuracy: 0.7967 - val_loss: 0.4341 Epoch 83/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8074 - loss: 0.4124 - val_accuracy: 0.7912 - val_loss: 0.4362 Epoch 84/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7949 - loss: 0.4219 - val_accuracy: 0.7943 - val_loss: 0.4346 Epoch 85/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8083 - loss: 0.4114 - val_accuracy: 0.7904 - val_loss: 0.4385 Epoch 86/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8060 - loss: 0.4136 - val_accuracy: 0.7920 - val_loss: 0.4345 Epoch 87/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8053 - loss: 0.4105 - val_accuracy: 0.7998 - val_loss: 0.4338 Epoch 88/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8043 - loss: 0.4153 - val_accuracy: 0.7967 - val_loss: 0.4333 Epoch 89/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8070 - loss: 0.4159 - val_accuracy: 0.7940 - val_loss: 0.4344 Epoch 90/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8062 - loss: 0.4156 - val_accuracy: 0.7967 - val_loss: 0.4348 Epoch 91/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8030 - loss: 0.4152 - val_accuracy: 0.7936 - val_loss: 0.4360 Epoch 92/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8049 - loss: 0.4095 - val_accuracy: 0.7959 - val_loss: 0.4334 Epoch 93/100 319/319 
━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7978 - loss: 0.4131 - val_accuracy: 0.7987 - val_loss: 0.4332 Epoch 94/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8090 - loss: 0.4084 - val_accuracy: 0.7971 - val_loss: 0.4345 Epoch 95/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8073 - loss: 0.4136 - val_accuracy: 0.7983 - val_loss: 0.4346 Epoch 96/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8087 - loss: 0.4100 - val_accuracy: 0.7975 - val_loss: 0.4343 Epoch 97/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7981 - loss: 0.4209 - val_accuracy: 0.7979 - val_loss: 0.4344 Epoch 98/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7985 - loss: 0.4150 - val_accuracy: 0.7967 - val_loss: 0.4342 Epoch 99/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8050 - loss: 0.4052 - val_accuracy: 0.7979 - val_loss: 0.4358 Epoch 100/100 319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8090 - loss: 0.4115 - val_accuracy: 0.7995 - val_loss: 0.4332
#checking the loss visualization
plt.title('Visualization of Loss and Validation Loss Performance With Balanced Data')
plt.plot(history_bal.history['loss'])
plt.plot(history_bal.history['val_loss']);
#checking the accuracy visualization
plt.title('Visualization of Accuracy and Validation Accuracy Performance With Balanced Data')
plt.plot(history_bal.history['accuracy'])
plt.plot(history_bal.history['val_accuracy']);
y_log_bal = model_bal.predict(X_test_scaled_bal)
100/100 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
y_pred_bal = np.where(y_log_bal > 0.5, 1, 0)
accuracy_score(y_test_bal, y_pred_bal)
0.7981795354676711
print(classification_report(y_test_bal, y_pred_bal))
precision recall f1-score support
0 0.79 0.80 0.79 1536
1 0.81 0.80 0.80 1650
accuracy 0.80 3186
macro avg 0.80 0.80 0.80 3186
weighted avg 0.80 0.80 0.80 3186
With balanced training data, the model achieves a test accuracy of about 79.8%, with precision, recall, and F1 of roughly 0.80 for both classes.
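The thresholding step above (`np.where(y_log_bal > 0.5, 1, 0)`) can be sketched in plain Python with toy probabilities; lowering the cutoff below 0.5 trades precision for recall on the churner class:

```python
# Convert sigmoid outputs to hard 0/1 labels at a chosen cutoff.
probs = [0.12, 0.55, 0.49, 0.91]  # toy probabilities, not actual model output

def to_labels(ps, threshold=0.5):
    return [1 if p > threshold else 0 for p in ps]

print(to_labels(probs))       # [0, 1, 0, 1]
print(to_labels(probs, 0.4))  # [0, 1, 1, 1], more customers flagged as likely churners
```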
Model Improvement: Neural Network model with Batch Normalization¶
# use sklearn for train_test_split
X_train_bn, X_test_bn, y_train_bn, y_test_bn = train_test_split(X, y, test_size = 0.2, random_state = 42)
# check the shape of X_train & X_test, y_train & y_test
print("X_train Shape : ", X_train_bn.shape)
print("X_test Shape : ", X_test_bn.shape)
print("y_train Shape : ", y_train_bn.shape)
print("y_test Shape : ", y_test_bn.shape)
X_train Shape : (12740, 11) X_test Shape : (3186, 11) y_train Shape : (12740,) y_test Shape : (3186,)
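These shapes are consistent with `test_size = 0.2`: scikit-learn computes the test-set size as the ceiling of n_samples × test_size. A quick arithmetic check:

```python
import math

n = 12740 + 3186                # total rows, from the shapes printed above
test_rows = math.ceil(n * 0.2)  # sklearn rounds the test fraction up
print(n, test_rows)             # 15926 3186
```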
# use sklearn's StandardScaler to standardize the input feature values
scaler = StandardScaler()
X_train_scaled_bn = scaler.fit_transform(X_train_bn)
X_test_scaled_bn = scaler.transform(X_test_bn)
X_train_scaled_bn
array([[ 1.74458075e+00, 2.64948525e-01, 1.02260197e+00, ...,
9.74118179e-01, -6.65070356e-01, -5.47359263e-01],
[ 1.24130320e+00, 7.61019204e-02, 8.82175645e-04, ...,
9.74118179e-01, -6.65070356e-01, -5.47359263e-01],
[ 8.06835512e-02, -1.83213817e-02, 1.02260197e+00, ...,
9.74118179e-01, 1.50360032e+00, -5.47359263e-01],
...,
[ 3.16915869e-01, -8.68131101e-01, 8.82175645e-04, ...,
-1.02656949e+00, -6.65070356e-01, -5.47359263e-01],
[ 4.60709454e-01, -4.90437892e-01, -3.39691089e-01, ...,
-1.02656949e+00, -6.65070356e-01, -5.47359263e-01],
[-2.68529441e-01, -1.12744684e-01, 1.70374850e+00, ...,
9.74118179e-01, -6.65070356e-01, -5.47359263e-01]])
X_test_scaled_bn
array([[ 0.46070945, 0.45379513, -1.02083762, ..., 0.97411818,
-0.66507036, 1.82695364],
[-1.34698133, -0.39601459, 1.02260197, ..., -1.02656949,
-0.66507036, -0.54735926],
[-1.19291677, -0.39601459, -1.36141088, ..., 0.97411818,
-0.66507036, 1.82695364],
...,
[-0.11446489, -1.0569777 , 1.02260197, ..., 0.97411818,
1.50360032, -0.54735926],
[-1.0799361 , -0.7737078 , 1.02260197, ..., 0.97411818,
1.50360032, -0.54735926],
[-2.4665171 , -0.39601459, -1.36141088, ..., 0.97411818,
-0.66507036, 1.82695364]])
model_bn = tf.keras.Sequential([
tf.keras.layers.Dense(11, input_shape=(11,), activation='relu'),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(6, activation='relu'),
tf.keras.layers.BatchNormalization(),
tf.keras.layers.Dense(1, activation='sigmoid')
])
model_bn.summary()
Model: "sequential_5"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_15 (Dense)                │ (None, 11)             │           132 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization             │ (None, 11)             │            44 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_16 (Dense)                │ (None, 6)              │            72 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_1           │ (None, 6)              │            24 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_17 (Dense)                │ (None, 1)              │             7 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 279 (1.09 KB)
Trainable params: 245 (980.00 B)
Non-trainable params: 34 (136.00 B)
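The trainable/non-trainable split in this summary follows from how BatchNormalization stores its parameters: four per feature, of which gamma and beta are trainable while the moving mean and variance are not. A quick check:

```python
# BatchNormalization parameters per feature:
# trainable: gamma (scale), beta (shift); non-trainable: moving mean, moving variance.
def bn_params(n_features):
    return 2 * n_features, 2 * n_features  # (trainable, non_trainable)

t1, nt1 = bn_params(11)  # first BN layer: 22 + 22 = 44 params
t2, nt2 = bn_params(6)   # second BN layer: 12 + 12 = 24 params
print(nt1 + nt2)         # 34 non-trainable params, matching model_bn.summary()
```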
model_bn.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history_bn = model_bn.fit(X_train_scaled_bn, y_train_bn, epochs = 100, validation_split=0.25)
Epoch 1/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 5s 8ms/step - accuracy: 0.6031 - loss: 0.6740 - val_accuracy: 0.6936 - val_loss: 0.6003 Epoch 2/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.6921 - loss: 0.5912 - val_accuracy: 0.7256 - val_loss: 0.5540 Epoch 3/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7227 - loss: 0.5513 - val_accuracy: 0.7413 - val_loss: 0.5227 Epoch 4/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7368 - loss: 0.5237 - val_accuracy: 0.7545 - val_loss: 0.5051 Epoch 5/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7509 - loss: 0.5110 - val_accuracy: 0.7604 - val_loss: 0.4949 Epoch 6/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7557 - loss: 0.5034 - val_accuracy: 0.7582 - val_loss: 0.4876 Epoch 7/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7434 - loss: 0.5039 - val_accuracy: 0.7661 - val_loss: 0.4815 Epoch 8/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7523 - loss: 0.4955 - val_accuracy: 0.7699 - val_loss: 0.4758 Epoch 9/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7589 - loss: 0.4921 - val_accuracy: 0.7689 - val_loss: 0.4718 Epoch 10/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7652 - loss: 0.4811 - val_accuracy: 0.7746 - val_loss: 0.4685 Epoch 11/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7646 - loss: 0.4797 - val_accuracy: 0.7714 - val_loss: 0.4654 Epoch 12/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7635 - loss: 0.4796 - val_accuracy: 0.7749 - val_loss: 0.4634 Epoch 13/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7613 - loss: 0.4815 - val_accuracy: 0.7771 - val_loss: 0.4618 Epoch 14/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7655 - loss: 0.4749 - val_accuracy: 0.7790 - val_loss: 0.4610 Epoch 15/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7624 - loss: 0.4721 - val_accuracy: 0.7790 - val_loss: 0.4615 Epoch 16/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - 
accuracy: 0.7679 - loss: 0.4727 - val_accuracy: 0.7777 - val_loss: 0.4580 Epoch 17/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7705 - loss: 0.4678 - val_accuracy: 0.7821 - val_loss: 0.4584 Epoch 18/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7699 - loss: 0.4674 - val_accuracy: 0.7808 - val_loss: 0.4552 Epoch 19/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7834 - loss: 0.4525 - val_accuracy: 0.7790 - val_loss: 0.4565 Epoch 20/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7768 - loss: 0.4616 - val_accuracy: 0.7821 - val_loss: 0.4548 Epoch 21/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7769 - loss: 0.4626 - val_accuracy: 0.7846 - val_loss: 0.4534 Epoch 22/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7841 - loss: 0.4534 - val_accuracy: 0.7890 - val_loss: 0.4519 Epoch 23/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7750 - loss: 0.4604 - val_accuracy: 0.7840 - val_loss: 0.4514 Epoch 24/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7743 - loss: 0.4657 - val_accuracy: 0.7840 - val_loss: 0.4510 Epoch 25/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7763 - loss: 0.4595 - val_accuracy: 0.7815 - val_loss: 0.4498 Epoch 26/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7741 - loss: 0.4606 - val_accuracy: 0.7893 - val_loss: 0.4487 Epoch 27/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7846 - loss: 0.4506 - val_accuracy: 0.7868 - val_loss: 0.4508 Epoch 28/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7830 - loss: 0.4538 - val_accuracy: 0.7871 - val_loss: 0.4489 Epoch 29/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7778 - loss: 0.4601 - val_accuracy: 0.7887 - val_loss: 0.4490 Epoch 30/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7800 - loss: 0.4536 - val_accuracy: 0.7887 - val_loss: 0.4477 Epoch 31/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7759 - loss: 0.4543 - val_accuracy: 
0.7849 - val_loss: 0.4479 Epoch 32/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7800 - loss: 0.4564 - val_accuracy: 0.7849 - val_loss: 0.4460 Epoch 33/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7817 - loss: 0.4546 - val_accuracy: 0.7890 - val_loss: 0.4477 Epoch 34/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7756 - loss: 0.4632 - val_accuracy: 0.7909 - val_loss: 0.4474 Epoch 35/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7818 - loss: 0.4554 - val_accuracy: 0.7928 - val_loss: 0.4495 Epoch 36/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7763 - loss: 0.4552 - val_accuracy: 0.7887 - val_loss: 0.4468 Epoch 37/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7686 - loss: 0.4560 - val_accuracy: 0.7862 - val_loss: 0.4469 Epoch 38/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7775 - loss: 0.4560 - val_accuracy: 0.7909 - val_loss: 0.4465 Epoch 39/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7808 - loss: 0.4588 - val_accuracy: 0.7868 - val_loss: 0.4472 Epoch 40/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7797 - loss: 0.4515 - val_accuracy: 0.7865 - val_loss: 0.4477 Epoch 41/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7820 - loss: 0.4555 - val_accuracy: 0.7837 - val_loss: 0.4465 Epoch 42/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7841 - loss: 0.4530 - val_accuracy: 0.7840 - val_loss: 0.4463 Epoch 43/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7867 - loss: 0.4436 - val_accuracy: 0.7896 - val_loss: 0.4466 Epoch 44/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7830 - loss: 0.4525 - val_accuracy: 0.7896 - val_loss: 0.4474 Epoch 45/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7786 - loss: 0.4548 - val_accuracy: 0.7865 - val_loss: 0.4458 Epoch 46/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7833 - loss: 0.4459 - val_accuracy: 0.7859 - val_loss: 0.4463 Epoch 47/100 299/299 
━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7822 - loss: 0.4500 - val_accuracy: 0.7887 - val_loss: 0.4460 Epoch 48/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7891 - loss: 0.4454 - val_accuracy: 0.7912 - val_loss: 0.4456 Epoch 49/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7864 - loss: 0.4482 - val_accuracy: 0.7893 - val_loss: 0.4443 Epoch 50/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7813 - loss: 0.4616 - val_accuracy: 0.7934 - val_loss: 0.4458 Epoch 51/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7839 - loss: 0.4464 - val_accuracy: 0.7922 - val_loss: 0.4467 Epoch 52/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7880 - loss: 0.4500 - val_accuracy: 0.7950 - val_loss: 0.4460 Epoch 53/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7864 - loss: 0.4496 - val_accuracy: 0.7925 - val_loss: 0.4450 Epoch 54/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7866 - loss: 0.4480 - val_accuracy: 0.7912 - val_loss: 0.4471 Epoch 55/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7803 - loss: 0.4538 - val_accuracy: 0.7896 - val_loss: 0.4462 Epoch 56/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7819 - loss: 0.4559 - val_accuracy: 0.7972 - val_loss: 0.4448 Epoch 57/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7831 - loss: 0.4499 - val_accuracy: 0.7962 - val_loss: 0.4454 Epoch 58/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7773 - loss: 0.4504 - val_accuracy: 0.7934 - val_loss: 0.4446 Epoch 59/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7825 - loss: 0.4478 - val_accuracy: 0.7940 - val_loss: 0.4444 Epoch 60/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7859 - loss: 0.4551 - val_accuracy: 0.7931 - val_loss: 0.4434 Epoch 61/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7840 - loss: 0.4504 - val_accuracy: 0.7890 - val_loss: 0.4449 Epoch 62/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 
0.7912 - loss: 0.4428 - val_accuracy: 0.7893 - val_loss: 0.4458 Epoch 63/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7845 - loss: 0.4537 - val_accuracy: 0.7887 - val_loss: 0.4452 Epoch 64/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7909 - loss: 0.4465 - val_accuracy: 0.7918 - val_loss: 0.4462 Epoch 65/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7888 - loss: 0.4389 - val_accuracy: 0.7849 - val_loss: 0.4453 Epoch 66/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7841 - loss: 0.4443 - val_accuracy: 0.7912 - val_loss: 0.4460 Epoch 67/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7867 - loss: 0.4504 - val_accuracy: 0.7915 - val_loss: 0.4464 Epoch 68/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7787 - loss: 0.4536 - val_accuracy: 0.7909 - val_loss: 0.4448 Epoch 69/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7805 - loss: 0.4540 - val_accuracy: 0.7881 - val_loss: 0.4456 Epoch 70/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7795 - loss: 0.4505 - val_accuracy: 0.7887 - val_loss: 0.4440 Epoch 71/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7795 - loss: 0.4560 - val_accuracy: 0.7922 - val_loss: 0.4455 Epoch 72/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7782 - loss: 0.4561 - val_accuracy: 0.7893 - val_loss: 0.4440 Epoch 73/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7837 - loss: 0.4579 - val_accuracy: 0.7868 - val_loss: 0.4440 Epoch 74/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7870 - loss: 0.4467 - val_accuracy: 0.7868 - val_loss: 0.4436 Epoch 75/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7875 - loss: 0.4409 - val_accuracy: 0.7865 - val_loss: 0.4438 Epoch 76/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7847 - loss: 0.4544 - val_accuracy: 0.7878 - val_loss: 0.4445 Epoch 77/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7811 - loss: 0.4539 - val_accuracy: 0.7928 - 
val_loss: 0.4440 Epoch 78/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7861 - loss: 0.4444 - val_accuracy: 0.7881 - val_loss: 0.4446 Epoch 79/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7818 - loss: 0.4504 - val_accuracy: 0.7900 - val_loss: 0.4430 Epoch 80/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7878 - loss: 0.4360 - val_accuracy: 0.7918 - val_loss: 0.4428 Epoch 81/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7872 - loss: 0.4396 - val_accuracy: 0.7937 - val_loss: 0.4434 Epoch 82/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7905 - loss: 0.4401 - val_accuracy: 0.7893 - val_loss: 0.4437 Epoch 83/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7827 - loss: 0.4414 - val_accuracy: 0.7915 - val_loss: 0.4412 Epoch 84/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7870 - loss: 0.4399 - val_accuracy: 0.7906 - val_loss: 0.4423 Epoch 85/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7950 - loss: 0.4391 - val_accuracy: 0.7937 - val_loss: 0.4419 Epoch 86/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7840 - loss: 0.4465 - val_accuracy: 0.7940 - val_loss: 0.4413 Epoch 87/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7923 - loss: 0.4399 - val_accuracy: 0.7909 - val_loss: 0.4410 Epoch 88/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7778 - loss: 0.4544 - val_accuracy: 0.7918 - val_loss: 0.4424 Epoch 89/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7861 - loss: 0.4416 - val_accuracy: 0.7903 - val_loss: 0.4422 Epoch 90/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7856 - loss: 0.4508 - val_accuracy: 0.7893 - val_loss: 0.4427 Epoch 91/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7880 - loss: 0.4485 - val_accuracy: 0.7925 - val_loss: 0.4413 Epoch 92/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7776 - loss: 0.4545 - val_accuracy: 0.7959 - val_loss: 0.4417 Epoch 93/100 299/299 
━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7787 - loss: 0.4553 - val_accuracy: 0.7953 - val_loss: 0.4425 Epoch 94/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7877 - loss: 0.4389 - val_accuracy: 0.7931 - val_loss: 0.4416 Epoch 95/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7876 - loss: 0.4461 - val_accuracy: 0.7934 - val_loss: 0.4428 Epoch 96/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7889 - loss: 0.4436 - val_accuracy: 0.7943 - val_loss: 0.4405 Epoch 97/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7872 - loss: 0.4468 - val_accuracy: 0.7900 - val_loss: 0.4423 Epoch 98/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7809 - loss: 0.4518 - val_accuracy: 0.7959 - val_loss: 0.4399 Epoch 99/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7842 - loss: 0.4425 - val_accuracy: 0.7953 - val_loss: 0.4413 Epoch 100/100 299/299 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7843 - loss: 0.4561 - val_accuracy: 0.7925 - val_loss: 0.4404
#checking the loss visualization
plt.title('Visualization of Loss and Validation Loss Performance With Batch Normalization')
plt.plot(history_bn.history['loss'])
plt.plot(history_bn.history['val_loss'])
plt.ylabel('loss')
plt.xlabel('epoch');
#checking the accuracy visualization
plt.title('Visualization of Accuracy and Validation Accuracy Performance With Batch Normalization')
plt.plot(history_bn.history['accuracy'])
plt.plot(history_bn.history['val_accuracy'])
plt.ylabel('accuracy')
plt.xlabel('epoch');
y_log_bn = model_bn.predict(X_test_scaled_bn)
100/100 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step
y_pred_bn = np.where(y_log_bn > 0.5, 1, 0)
accuracy_score(y_test_bn, y_pred_bn)
0.7890772128060264
print(classification_report(y_test_bn, y_pred_bn))
precision recall f1-score support
0 0.78 0.81 0.79 1605
1 0.80 0.77 0.78 1581
accuracy 0.79 3186
macro avg 0.79 0.79 0.79 3186
weighted avg 0.79 0.79 0.79 3186
Observation:
With batch normalization, the model achieves a test accuracy of about 78.9%, with precision and recall between roughly 0.77 and 0.81 for both classes.
Step 12: Final Model Comparison¶
# create a dataframe from the accuracy performance of the models
perf_accuracy = {
"Artificial Neural Network Performance - Accuracy Score": 86.60,
"Artificial Neural Network with Adam Optimizer Performance - Accuracy Score": 86.80,
"Artificial Neural Network with DropOut Optimizer Performance - Accuracy Score": 82.00,
"Artificial Neural Network with Hyperparameter Tuning Performance - Accuracy Score": 86.55,
"Artificial Neural Network with Balanced Data Performance - Accuracy Score": 80.10,
"Artificial Neural Network with Batch Normalization Performance - Accuracy Score": 85.55,
}
perf_accuracy = pd.DataFrame.from_dict(
perf_accuracy, orient="index", columns=["Accuracy Scores"]
)
display(perf_accuracy.sort_values("Accuracy Scores"))
| | Accuracy Scores |
|---|---|
| Artificial Neural Network with Balanced Data Performance - Accuracy Score | 80.10 |
| Artificial Neural Network with Dropout Performance - Accuracy Score | 82.00 |
| Artificial Neural Network with Batch Normalization Performance - Accuracy Score | 85.55 |
| Artificial Neural Network with Hyperparameter Tuning Performance - Accuracy Score | 86.55 |
| Artificial Neural Network Performance - Accuracy Score | 86.60 |
| Artificial Neural Network with Adam Optimizer Performance - Accuracy Score | 86.80 |
Observation:
All 6 neural networks used to analyze the data scored 80% accuracy or above; three of them scored above 86%:
- Artificial Neural Network
- Artificial Neural Network with Hyperparameter Tuning
- Artificial Neural Network with Adam Optimizer
I will examine the feature importance of these three models to understand what drives their predictions in a real-life situation. I will use SHAP for this, since its DeepExplainer works with our sequential Keras models.
pip install shap
pip install tensorflow
Feature Importance - Model Optimized with Hyperparameter Tuning¶
import shap
from tensorflow.keras import Sequential
import numpy as np
# load JS visualization code to notebook
shap.initjs()
# Scale the entire X DataFrame using the scaler fitted on X_train_hpt
X_scaled = scaler.transform(X)
# explain the model's predictions using SHAP
explainer = shap.DeepExplainer(model_hpt,data=X_scaled[:1500])
# Limit the number of samples to explain for the SHAP values calculation
X_to_explain = X_scaled[:1500]
shap_values_raw = explainer.shap_values(X_to_explain)
print(f"Shape of X_to_explain: {X_to_explain.shape}")
print(f"Shape of shap_values_raw (list of arrays): {len(shap_values_raw)}")
if len(shap_values_raw) > 0:
    print(f"Shape of shap_values_raw[0]: {shap_values_raw[0].shape}")
# DeepExplainer can return a list of arrays, where each array corresponds to one output.
# For a single-output model, it should be a list of one array: [shap_values_for_all_samples]
# However, the print statement indicates shap_values_raw is a list of 1500 arrays,
# where each internal array has shape (11, 1). This implies it's per-sample.
# So, we need to stack the individual (11,1) shap_values into a (1500,11) matrix.
shap_values_matrix = np.array([s.squeeze(axis=-1) for s in shap_values_raw])
# For summary_plot, we need features (X) to match the number of samples in shap_values_matrix
# So, we should also take the corresponding subset of X for feature names.
X_for_summary_plot = X.iloc[:X_to_explain.shape[0], :]
# visualize the first prediction's explanation (use matplotlib=True to avoid Javascript)
# Note: X.iloc[0,:] is used here to get feature names for display, assuming order is maintained.
# The `shap_values[0][0,:]` might have an extra dimension of 1 for single-output models, so .squeeze() is used.
# shap.force_plot(explainer.expected_value[0], shap_values_raw[0][0,:].squeeze(), X.iloc[0,:])
# Use X_for_summary_plot (subset of original DataFrame for column names) for the summary plot
shap.summary_plot(shap_values_matrix, X_for_summary_plot, plot_type="bar")
Shape of X_to_explain: (1500, 11) Shape of shap_values_raw (list of arrays): 1500 Shape of shap_values_raw[0]: (11, 1)
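The stacking step above can be checked in isolation. Here is a toy stand-in (hypothetical arrays, not the real explainer output) showing how a list of per-sample arrays of shape `(n_features, 1)`, as `DeepExplainer` returned here, squeezes and stacks into an `(n_samples, n_features)` matrix suitable for `summary_plot`:

```python
import numpy as np

# Toy stand-in for the explainer output: 4 per-sample arrays of shape (11, 1)
per_sample = [np.full((11, 1), float(i)) for i in range(4)]

# Squeeze the trailing singleton axis and stack into (n_samples, n_features)
matrix = np.array([s.squeeze(axis=-1) for s in per_sample])
print(matrix.shape)  # (4, 11)
```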
Feature Importance - Model - Neural Network¶
import shap
from tensorflow.keras import Sequential
import numpy as np
# load JS visualization code to notebook
shap.initjs()
# Scale the entire X DataFrame using the scaler fitted during previous steps
X_scaled_for_ann = scaler.transform(X)
# explain the model's predictions using SHAP
explainer = shap.DeepExplainer(model_ann,data=X_scaled_for_ann[:1500])
# Limit the number of samples to explain for the SHAP values calculation
X_to_explain_ann = X_scaled_for_ann[:1500]
shap_values_ann = explainer.shap_values(X_to_explain_ann)
print(f"Shape of X_to_explain_ann: {X_to_explain_ann.shape}")
print(f"Shape of shap_values_ann (list of arrays): {len(shap_values_ann)}")
if len(shap_values_ann) > 0:
    print(f"Shape of shap_values_ann[0]: {shap_values_ann[0].shape}")
# DeepExplainer can return a list of arrays, where each array corresponds to one output.
# For a single-output model, it should be a list of one array: [shap_values_for_all_samples]
# However, shap_values_raw might be a list of per-sample explanations with shape (num_features, 1).
# So, we need to stack the individual (num_features,1) shap_values into a (num_samples, num_features) matrix.
shap_values_matrix = np.array([s.squeeze(axis=-1) for s in shap_values_ann])
# For summary_plot, we need features (X) to match the number of samples in shap_values_matrix
# So, we should also take the corresponding subset of X for feature names.
X_for_summary_plot = X.iloc[:X_to_explain_ann.shape[0], :]
# visualize the first prediction's explanation (use matplotlib=True to avoid Javascript)
# shap.force_plot(explainer.expected_value[0], shap_values_raw[0][0,:].squeeze(), X.iloc[0,:])
# Use X_for_summary_plot (subset of original DataFrame for column names) for the summary plot
shap.summary_plot(shap_values_matrix, X_for_summary_plot, plot_type="bar")
Shape of X_to_explain_ann: (1500, 11) Shape of shap_values_ann (list of arrays): 1500 Shape of shap_values_ann[0]: (11, 1)
Feature Importance - Model Optimized with Adam Optimizer¶
import shap
from tensorflow.keras import Sequential
import numpy as np
# load JS visualization code to notebook
shap.initjs()
# Scale the entire X DataFrame using the scaler fitted during previous steps
X_scaled_for_adm = scaler.transform(X)
# explain the model's predictions using SHAP
explainer = shap.DeepExplainer(model_adm,data=X_scaled_for_adm[:1500])
# Limit the number of samples to explain for the SHAP values calculation
X_to_explain_adm = X_scaled_for_adm[:1500]
shap_values_adm = explainer.shap_values(X_to_explain_adm)
print(f"Shape of X_to_explain_adm: {X_to_explain_adm.shape}")
print(f"Shape of shap_values_adm (list of arrays): {len(shap_values_adm)}")
if len(shap_values_adm) > 0:
    print(f"Shape of shap_values_adm[0]: {shap_values_adm[0].shape}")
# DeepExplainer can return a list of arrays, where each array corresponds to one output.
# For a single-output model, it should be a list of one array: [shap_values_for_all_samples]
# However, shap_values_raw might be a list of per-sample explanations with shape (num_features, 1).
# So, we need to stack the individual (num_features,1) shap_values into a (num_samples, num_features) matrix.
shap_values_matrix = np.array([s.squeeze(axis=-1) for s in shap_values_adm])
# For summary_plot, we need features (X) to match the number of samples in shap_values_matrix
# So, we should also take the corresponding subset of X for feature names.
X_for_summary_plot = X.iloc[:X_to_explain_adm.shape[0], :]
# visualize the first prediction's explanation (use matplotlib=True to avoid Javascript)
# shap.force_plot(explainer.expected_value[0], shap_values_raw[0][0,:].squeeze(), X.iloc[0,:])
# Use X_for_summary_plot (subset of original DataFrame for column names) for the summary plot
shap.summary_plot(shap_values_matrix, X_for_summary_plot, plot_type="bar")
Shape of X_to_explain_adm: (1500, 11) Shape of shap_values_adm (list of arrays): 1500 Shape of shap_values_adm[0]: (11, 1)
Observation on feature importance:
All 3 models showed that the following features matter most in determining whether a customer will churn:
- Number of Products
- Age
Business Insights and Conclusions:¶
From the data visualizations and SHAP analysis, the following have been deduced:
Male customers account for 54% of churned customers, compared with 46% for female customers.
The dataset covers bank customers in three European countries: France, Germany, and Spain.
Customers located in France are more likely to churn than those in Spain and Germany.
Customers who use only one product are more likely to churn.
Customers holding a credit card (around 70% of the customer base) churn more than those without one.
Customers maintaining an active bank account are more likely to churn than inactive members.
Feature Importance (from SHAP Analysis):
- Balance is consistently the most important feature in predicting customer churn across all top-performing neural network models.
- Descriptive insight: Customers maintaining a USD 0 balance or balances between USD 100,000 and USD 125,000 are more likely to churn. Conversely, those with balances between USD 100 and USD 99,500, or between USD 130,000 and USD 200,000, are less likely to churn.
- Age is consistently the second most important feature.
- Descriptive insight: Bank customers whose age is around 46 are more likely to churn.
- CreditScore is consistently the third most important feature.
- Descriptive insight: Bank customers with a credit score around 650 are more likely to churn.
- EstimatedSalary is consistently the fourth most important feature.
- Descriptive insight: Most customers (over 80%) have an estimated salary above USD50,000, with a significant group around USD175,000. This group is also observed to be more likely to churn.
Customers with a 2-year tenure have a higher churn percentage than other tenure groups.
Recommendations:¶
It is clear from the data analysis that the balance a customer maintains with the bank is a key indicator of whether they will churn. Given that customers with a USD 0 balance or balances between USD 100,000 and USD 125,000 are more likely to churn, the bank should consider one or more of the following to incentivize customers to maintain more stable balances:
a. Offering interest on checking account balances.
b. Organizing contests for customers who maintain higher balances, with attractive prizes.
c. Marketing savings accounts with above-market interest rates as an incentive.
The Bank should proactively target customers whose estimated salary is above USD50,000 per annum. As these customers constitute over 80% of the dataset and are observed to be more likely to churn, the bank should organize aggressive marketing campaigns showcasing all available premium offers. These could include long moratorium credit cards, expansive cashback deals, generous mileage deals, competitive APRs, and aesthetically pleasing credit card designs.
For customers regularly using their credit cards, especially those with an estimated salary above USD50,000, the bank should also offer proactive credit limit increases. This is expected to boost credit card spending and transaction amounts, thereby discouraging churn.
Introducing 0% interest EMI (equated monthly installment) options on credit cards is another effective strategy. This would encourage customers to purchase higher-cost products with their credit cards and convert the expenditure into EMIs, increasing total transaction amounts and frequencies, and potentially impacting revolving balances positively.
Alongside existing card types, the bank can introduce credit cards specifically tailored for online shopping (with targeted cashback offers) or online food ordering (e.g., Doordash, Ubereats, Grubhub). This would significantly increase card usage and make it more challenging for customers to churn.
With our developed model, it is possible to accurately predict which customers are likely to churn. Based on the predicted churn probability, the marketing department should be empowered to reach out through targeted advertising schemes, focusing on at least the top 50-80% of high-risk customers. These campaigns should promote new credit card offers, increased credit limits, and newly introduced cashback programs. This can be initiated as a pilot program to retain these customers before wider implementation.
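Operationally, this targeting step amounts to ranking customers by predicted churn probability and handing the top slice to marketing. A minimal sketch with hypothetical probabilities (in practice these would come from `model.predict` on scaled features, joined back to `CustomerId`):

```python
import numpy as np
import pandas as pd

# Hypothetical predicted churn probabilities for ten customers
rng = np.random.default_rng(42)
customers = pd.DataFrame({
    "CustomerId": np.arange(1, 11),
    "churn_prob": rng.uniform(0, 1, size=10).round(3),
})

# Rank by predicted risk and take the top 50% for the pilot retention campaign
pilot = customers.sort_values("churn_prob", ascending=False).head(len(customers) // 2)
print(pilot)
```

The same pattern scales to the full customer base; the cutoff fraction (50-80% here) can be tuned to the marketing budget.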
The bank should address customers with very poor credit scores (below 400), which was identified as an important feature. Free informational services should be provided on how these customers can improve their credit scores. Successful improvement could foster strong loyalty. Additionally, providing low credit limit credit cards could assist them in their credit-building journey.
While the initial univariate analysis observed male customers churning more than female customers, and customers from France showing higher churn, SHAP analysis did not identify Gender or Geography as primary drivers of churn in this model. Therefore, while further study into demographic information (like specific geographical trends influencing male customer behavior) might be considered, these factors are not the immediate focus for churn reduction based on the current model's feature importance.
Regarding customer tenure, which was found to have some impact (with 2-year tenure showing higher churn rates), the Bank should consider offering credit cards with extended 18 to 24-month 0% APR periods to all customers, irrespective of their balance or estimated salary. This could help retain customers across different tenure groups.
!pip install nbconvert
%%shell
jupyter nbconvert --to html '/content/drive/My Drive/UTA - AIML/Deep Learning project/Samson_Akomolafe_Deep_Learning_Fullcode_Project_4c.ipynb'